SYNOPSIS


In this thesis we consider the problem of detecting and
identifying multiple outliers in a fixed effects linear model.
An outlier is an observation that deviates from the rest of the
observations in some sense. Most of the work in this field
assumes that a single outlier is present in the data. However,
it is reasonable to suspect more than one outlier when the
number of observations is not too small. Some authors have
recommended a sequential procedure for the multiple outlier
case. But this has the drawback of the masking effect, that is,
when more than one outlier is present, a test for one outlier
may not detect even a single outlier. Hence a block procedure
for testing two or more outliers is proposed.

We consider a general linear model, that is, $\mathbf{Y}$ is
distributed as normal with mean $X\boldsymbol{\beta}$ and
variance-covariance matrix $\sigma^2 I$, written
$N(X\boldsymbol{\beta}, \sigma^2 I)$, where $\mathbf{Y}$ is the
$n$-component vector of random variables, $\boldsymbol{\beta}$ is
an $m$-component vector of unknown parameters, $\sigma^2$ is the
unknown variance of each $Y_i$, and $X$ is the known design matrix
of order $n \times m$ and of rank $k$ ($k \le m < n$).

The residual vector and the residual sum of squares for this
model are $\mathbf{e} = A\mathbf{y}$ and $S^2 = \mathbf{y}'A\mathbf{y}$,
where $A = ((\lambda_{ij})) = I - X(X'X)^{-}X'$ is an idempotent
matrix of rank $n-k$, $\mathbf{y}$ is the realization of
$\mathbf{Y}$, and $(X'X)^{-}$ is any generalized inverse of $X'X$.



Suppose $s_\nu$ is an independent root mean square estimator
of $\sigma$ based on $\nu$ degrees of freedom. Denote the pooled
sum of squares based on $p = n-k+\nu$ degrees of freedom by
$S_p^2 = S^2 + \nu s_\nu^2$. Define

$$w_i = e_i/(S_p \lambda_{ii}^{1/2}), \quad i = 1, 2, \ldots, n,$$

as the weighted residuals. We propose the test statistics
for two outliers as


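$$U = \max_{1 \le i < j \le n} u_{ij}$$

and

$$V = \max_{1 \le i < j \le n} |v_{ij}|,$$

where for $i \ne j$,

$$u_{ij} = (w_i + w_j)/[2(1+\rho_{ij})]^{1/2},$$

$$v_{ij} = (w_i - w_j)/[2(1-\rho_{ij})]^{1/2}$$

and

$$\rho_{ij} = \lambda_{ij}/(\lambda_{ii}\lambda_{jj})^{1/2}.$$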

The joint distribution of these $u_{ij}$'s as well as $v_{ij}$'s
is obtained. From this the marginal density of $u_{ij}$ is deduced.
This is given by

$$f(u_{ij}) = (1-u_{ij}^2)^{(p-3)/2}/B[1/2, (p-1)/2], \quad -1 < u_{ij} < 1.$$

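Similarly, the joint probability density function (pdf)
of $u_{ij}$ and $u_{i'j'}$ is given by

(1)  $$f(u_{ij}, u_{i'j'}) = \frac{p-2}{2\pi(1-\rho_1^2)^{1/2}} \left[1 - \frac{u_{ij}^2 + u_{i'j'}^2 - 2\rho_1 u_{ij}u_{i'j'}}{1-\rho_1^2}\right]^{(p-4)/2}$$

inside the ellipse

$$u_{ij}^2 + u_{i'j'}^2 - 2\rho_1 u_{ij}u_{i'j'} = 1 - \rho_1^2,$$

where $\rho_1$ is the shape parameter given by

$$\rho_1 = (\rho_{ii'} + \rho_{ij'} + \rho_{ji'} + \rho_{jj'})/[2\{(1+\rho_{ij})(1+\rho_{i'j'})\}^{1/2}].$$

The marginal pdf of $v_{ij}$ is exactly the same as that of $u_{ij}$.
The expression for the joint pdf of $v_{ij}$ and $v_{i'j'}$ is
analogous to equation (1) with shape parameter $\rho_2$ given by

$$\rho_2 = (\rho_{ii'} - \rho_{ij'} - \rho_{ji'} + \rho_{jj'})/[2\{(1-\rho_{ij})(1-\rho_{i'j'})\}^{1/2}].$$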

Since the marginal distributions of $u_{ij}$ and $v_{ij}$ do not
depend on $i$ and $j$, the first Bonferroni inequality is
useful for evaluating nominal upper percentile points of
these statistics. We have also derived a recurrence relation
for the evaluation of bivariate probabilities like
$\Pr(u_{ij} > h, u_{i'j'} > k)$, as this is required for obtaining
bounds for the exact percentile points.
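
The use of the first Bonferroni inequality can be illustrated numerically.
The following is a minimal sketch, not the thesis's own computation: it
assumes the common marginal density $(1-u^2)^{(p-3)/2}/B[1/2,(p-1)/2]$
stated in this synopsis, takes the number of pairs to be $\binom{n}{2}$,
and solves $\binom{n}{2}\Pr(u_{ij} > c) = \alpha$ for the nominal point
$c$. The function names, Simpson's rule and the bisection search are
illustrative choices, and the values $n = 20$, $p = 12$ are hypothetical.

```python
from math import comb, exp, lgamma


def density(u, p):
    """Common marginal density of u_ij and v_ij on (-1, 1); assumes p >= 3."""
    log_beta = lgamma(0.5) + lgamma((p - 1) / 2) - lgamma(p / 2)
    return max(0.0, 1.0 - u * u) ** ((p - 3) / 2) / exp(log_beta)


def upper_tail(c, p, steps=2000):
    """Pr(u_ij > c) by composite Simpson's rule on [c, 1] (steps must be even)."""
    h = (1.0 - c) / steps
    total = density(c, p) + density(1.0, p)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * density(c + k * h, p)
    return total * h / 3


def nominal_point(n, p, alpha, two_sided=False):
    """Solve C(n,2) * Pr(u > c) = alpha by bisection.

    For the two-sided statistic the tail Pr(|v| > c) = 2 Pr(v > c) is used,
    which is an assumption of this sketch.
    """
    target = alpha / comb(n, 2) / (2 if two_sided else 1)
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if upper_tail(mid, p) > target:
            lo = mid  # tail still too heavy: move the point to the right
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical illustration: n = 20 observations, p = 12 degrees of freedom.
print(round(nominal_point(n=20, p=12, alpha=0.05), 4))
```

Because the Bonferroni bound overstates the probability of the union, a
point obtained this way is conservative: the true type I error does not
exceed $\alpha$.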


As an application, we consider a random sample from
$N(\mu, \sigma^2)$ as a special case of the general regression model.
If $Y_1, Y_2, \ldots, Y_n$ constitute a random sample from
$N(\mu, \sigma^2)$, then for $\nu = 0$, $u_{ij}$ and $v_{ij}$ reduce to

$$u_{ij} = [n/\{2(n-1)(n-2)\}]^{1/2}\,(y_i + y_j - 2\bar{y})/s$$

and

$$v_{ij} = [1/\{2(n-1)\}^{1/2}]\,(y_i - y_j)/s,$$

where $\bar{y} = \sum_{i=1}^{n} y_i/n$, $s^2 = S^2/(n-1)$ and
$S^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2$.

This immediately gives

$$U = [n/\{2(n-1)(n-2)\}]^{1/2}\, T_{N3}$$

and

$$V = [1/\{2(n-1)\}^{1/2}]\, T_{N6},$$

where $T_{N3} = (y_{(n)} + y_{(n-1)} - 2\bar{y})/s$ is Murphy's statistic
and $T_{N6} = (y_{(n)} - y_{(1)})/s$ is the internally studentized range
statistic, and $y_{(1)} \le y_{(2)} \le \cdots \le y_{(n)}$ are the order
statistics obtained from $y_1, y_2, \ldots, y_n$. Both these statistics
have been used for the detection of two outliers.

In this case, there are just two values of the shape parameter
for the joint distribution of $u_{ij}$ and $u_{i'j'}$. These are given
by $\rho_1 = (n-4)/[2(n-2)]$ and $\rho_1 = -2/(n-2)$ with respective
frequencies $n(n-1)(n-2)/2$ and $n(n-1)(n-2)(n-3)/8$. For the
joint distribution of $v_{ij}$ and $v_{i'j'}$, the shape parameter can
take three values, viz., $-0.5$, $0$ and $0.5$ with respective
frequencies $n(n-1)(n-2)/6$, $n(n-1)(n-2)(n-3)/8$ and $n(n-1)(n-2)/3$.
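
These values and frequencies can be checked by direct enumeration. The
sketch below is illustrative only. It assumes that the residuals of a
random sample have common correlation $-1/(n-1)$, and that the shape
parameters are the correlations of $(w_i + w_j)$ with $(w_{i'} + w_{j'})$
and of $(w_i - w_j)$ with $(w_{i'} - w_{j'})$, taking $i < j$ and
$i' < j'$. The function name and the choice $n = 6$ are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def shape_parameter_counts(n):
    """Tabulate shape parameters over all pairs of index pairs (i<j), (k<l)."""
    rho = Fraction(-1, n - 1)  # assumed common correlation between residuals

    def corr(a, b):
        return Fraction(1) if a == b else rho

    index_pairs = list(combinations(range(n), 2))
    u_counts, v_counts = Counter(), Counter()
    for (i, j), (k, l) in combinations(index_pairs, 2):
        # With equal correlations the denominators reduce to
        # 2(1 + rho) for the u's and 2(1 - rho) for the v's.
        rho_1 = (corr(i, k) + corr(i, l) + corr(j, k) + corr(j, l)) / (2 * (1 + rho))
        rho_2 = (corr(i, k) - corr(i, l) - corr(j, k) + corr(j, l)) / (2 * (1 - rho))
        u_counts[rho_1] += 1
        v_counts[rho_2] += 1
    return dict(u_counts), dict(v_counts)


u_counts, v_counts = shape_parameter_counts(6)
print(u_counts)  # shape parameters for pairs of u's and their frequencies
print(v_counts)  # shape parameters for pairs of v's and their frequencies
```

For $n = 6$ the stated formulas give the values $1/4$ and $-1/2$ with
frequencies 60 and 45 for the $u$'s, and $-1/2$, $0$, $1/2$ with
frequencies 20, 45 and 40 for the $v$'s; exact rational arithmetic is
used so that equal shape parameters are grouped without rounding error.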

It is shown that the nominal upper percentage points of $U$
and $V$ for some small values of $n$ and $\nu$ are exact.

Using the relations between $U$ and $T_{N3}$ and between $V$ and
$T_{N6}$, we obtain nominal percentile points of $T_{N3}$ and $T_{N6}$.
These are compared with the existing tabulated values. These points
are in considerable agreement with the tabulated points for
$n < 50$. For larger $n$ also, the deviation is not much. A
lower bound for the type I error probability is also calculated
with the help of the second Bonferroni inequality by evaluating
the bivariate probabilities. An approximate percentage point
for Murphy's test statistic is also obtained. By
comparing the approximate values with existing tabulated values,
we find that the approximation is remarkably good. We
therefore recommend that the approximate values be used
whenever the exact percentage points are not available,
especially for large values of $n$.

We next consider the case of a two-way layout with $r$ rows
and $c$ columns and one observation in each cell. The total number
of distinct correlation matrices of order 4 for 3x3 and 4x5
tables is enumerated. The total number of distinct shape
parameters for 3x3, 4x5 and 5x6 tables, along with their
respective frequencies, is also counted. For the joint
distribution of $u_{i_1j_1,i_2j_2}$ and $u_{i_3j_3,i_4j_4}$, the total
number of shape parameters is 40. Their number for 3x3, 4x5 and 5x6
layouts is 11, 37 and 40 respectively, since several of these
are equal or non-existent for special values of $r$ and $c$.
Similarly, for the joint distribution of $v_{i_1j_1,i_2j_2}$ and
$v_{i_3j_3,i_4j_4}$, the maximum number of distinct shape parameters
is 43. Their number for 3x3, 4x5 and 5x6 layouts is 13, 39 and 43
respectively.

These are then used for obtaining bounds for the exact
percentile points of the $U$ and $V$ statistics. Exact percentage
points are also obtained by the Monte Carlo method for some
combinations of $r$ and $c$. Nominal percentage points compare
favourably with these values.


For studying the performance of the proposed statistics,
we assume exactly two outliers are present in the data. The
null hypothesis is that there is no outlier, and the alternative
hypothesis suitable for the statistic $U$ is the union of
$\binom{n}{2}$ hypotheses $H_{ij}$, $1 \le i < j \le n$. Without loss
of generality we take $H_{12}$, where $H_{12}$ states that $y_1$ and
$y_2$ have a mean shifted to the right by amounts $\theta_1$ and
$\theta_2$ respectively. The exact density function of $u_{ij}$ under
$H_{12}$ is derived. Our measure of performance is
$P_{12} = \Pr(u_{12} > u_\alpha \mid H_{12})$, where $u_\alpha$ is the
critical value; it gives the probability that $u_{12}$ is significantly
large under $H_{12}$. Some other measures are also studied.


The approximate distributions of $u_{ij}$ and $v_{ij}$ under the
alternative hypothesis are also obtained. The approximate
distribution does not give satisfactory results for evaluating
appropriate measures of performance for small values of $\theta_1$
and $\theta_2$ for the statistic $U$. Consequently, we use the exact
distribution when $\theta_1$ and $\theta_2$ are small. In comparison to
the sequential procedure, our procedure performs better when
there are two outliers in the data.



Extensions for more than two outliers for the statistic
$U$ are given. This statistic for $m_1$ ($> 2$) outliers is given by

$$U = \max_{1 \le i_1 < i_2 < \cdots < i_{m_1} \le n} u_{i_1 i_2 \ldots i_{m_1}},$$

where

$$u_{i_1 i_2 \ldots i_{m_1}} = (w_{i_1} + \cdots + w_{i_{m_1}}) \Big/ \Big(m_1 + 2 \sum_{g=1}^{m_1} \sum_{\substack{h=1 \\ i_g < i_h}}^{m_1} \rho_{i_g i_h}\Big)^{1/2}.$$

This again reduces to Murphy's statistic for a random
sample of size $n$ from $N(\mu, \sigma^2)$. The statistic
$u_{i_1 i_2 \ldots i_{m_1}}$ has the same univariate density as that of
$u_{ij}$ for the two outlier case. For studying the performance of this
statistic, $\binom{n}{m_1}$ alternative hypotheses have to be considered.
The null and the non-null distributions are determined analogously to
those of the two outlier case. Performance of this statistic is
studied briefly. Such immediate extensions of $V$ do not seem
to hold.



CHAPTER I 


INTRODUCTION AND SUMMARY 

1.1. Scope

The problems of multiple outliers in a fixed effects
linear model are considered in this thesis. Outliers are those
observations in a sample which deviate in some sense from the rest
of the observations. Usually these deviations are in mean or
in variance or in both. Here we are concerned with test
statistics designed to be sensitive to various non-null patterns,
primarily a shift in mean of two or more variates when the
variance is unknown. The following topics are studied in this
thesis:

(i) Detection of two outliers in a general linear model.

(ii) Application to a random sample from $N(\mu, \sigma^2)$.

(iii) Application to a two-way layout.

(iv) Performance of the statistics.

(v) Extensions for more than two outliers.

These topics are discussed in detail with suitable tables
to support the theory, wherever necessary. The following
sections give an outline of what is covered under these topics.

1.2. Two outliers in a general linear model

Recent work in outlier detection and testing discordancy
in a general linear model is based on residuals, standardized
in some way. Anscombe (1961) and Anscombe and Tukey (1963) discuss
the analysis of residuals and give a detailed presentation of
their potential usefulness. Absolute studentized residuals and
related statistics for the detection of a single outlier in a
general linear model have been considered by several authors;
for example, see Srikantan (1961), Stefansky (1971), Joshi (1972,
1975), Ellenberg (1973, 1976), Lund (1975), Prescott (1975)
and Gentle (1978). Barnett and Lewis (1978, Ch. 7), Kale (1979),
Hawkins (1980, Ch. 7), David (1981, Section 8.6) and Beckman and
Cook (1983) have done excellent survey work in this field. More
recent work is done by Cook and Prescott (1981) and Doornbos
(1980, 1981). All these authors have mainly concentrated on a
single outlier, with some also considering sequential and other
procedures for two or more outliers. Their work is mainly based
on the assumption that there is at most one outlier in the given
data set, an assumption which is reasonable when the number
of observations is small. However, it is reasonable to suspect
more than one outlying observation when the number of observations
is not too small. Some authors, for example Anscombe (1960),
John and Draper (1978) and Gentleman (1980), have recommended
a sequential approach for detecting more than one outlier.
However, in the special case of a random sample from a
$N(\mu, \sigma^2)$ distribution, with two outliers on the right,
McMillan (1971) and Moran and McMillan (1973) have shown that the
performance of such tests is inferior to that of Murphy's test
(Murphy, 1951) for two outliers.

In general the total number of outliers present in any
given data is unknown. However, there are situations where
one feels that a specified number of observations are outliers;
for example, there may be a sudden drop in output of two machines
out of $n$ machines in a cloth manufacturing factory. If there
are two outlying observations, then a statistic which can detect
both the outliers simultaneously is preferred over a sequential
procedure, since it avoids the masking effect. By masking, it
is meant that "extreme" observations are not declared as outliers
because some other observations are also outliers. Thus if there
are two outliers in a data set and we test for one outlier, then
we may not be able to detect any outlier due to the presence
of the second outlier. This phenomenon is discussed by Pearson and
Chandra Sekar (1936), McMillan and David (1971), McMillan (1971)
etc. With similar motivation, we consider block procedures for
testing two outliers in linear models.

Let $Y_1, Y_2, \ldots, Y_n$ be $n$ independently and normally
distributed random variables which have a linear regression on a
known set of $m$ variables. Then the fixed effects linear model can be
described as

(1.2.1)  $$\mathbf{Y} \overset{d}{=} N(X\boldsymbol{\beta}, \sigma^2 I),$$

where the symbol $\overset{d}{=}$ stands for "is distributed according
to",

$$\mathbf{Y} = (Y_1, Y_2, \ldots, Y_n)', \quad \boldsymbol{\beta} = (\beta_1, \beta_2, \ldots, \beta_m)', \quad X = ((x_{ij}))_{n \times m},$$

$I$ is the identity matrix of order $n$, and $\boldsymbol{\beta}$ and
$\sigma^2$ are the unknown parameters. We also assume that $m < n$,
that is, there are more observations than the number of unknown
parameters in $\boldsymbol{\beta}$.


Let the rank of the known design matrix $X$ be $k \le m$, and let
$\mathbf{y}$ stand for the realization of $\mathbf{Y}$. Then the
normal equations for this model are given by

$$X'X\boldsymbol{\beta} = X'\mathbf{y}.$$

Let $(X'X)^{-}$ denote any generalized inverse of $X'X$
satisfying $(X'X)(X'X)^{-}(X'X) = X'X$; see Rao (1973, p. 24). Then
one solution of the normal equations is

$$\hat{\boldsymbol{\beta}} = (X'X)^{-}X'\mathbf{y}.$$

The estimated value of $\mathbf{y}$ for this model is

$$\hat{\mathbf{y}} = X\hat{\boldsymbol{\beta}} = X(X'X)^{-}X'\mathbf{y}$$

and the residual vector is

(1.2.2)  $$\mathbf{e} = \mathbf{y} - \hat{\mathbf{y}} = [I - X(X'X)^{-}X']\mathbf{y} = A\mathbf{y},$$

where

(1.2.3)  $$A = I - X(X'X)^{-}X' = ((\lambda_{ij}))$$

is a real, symmetric and idempotent matrix of rank $(n-k)$
satisfying $AX = O$. The residual vector $\mathbf{e}$ has a singular
normal distribution $N(\mathbf{0}, A\sigma^2)$. Further, the residual
sum of squares

(1.2.4)  $$S^2 = \mathbf{e}'\mathbf{e} = \mathbf{y}'A\mathbf{y}$$

has a central $\sigma^2\chi^2$ distribution with $(n-k)$ degrees of
freedom.
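
The stated properties of $A$ can be verified on a concrete design. The
sketch below is an illustration under assumptions, not part of the
thesis: it takes the additive two-way layout with one observation per
cell, for which the fitted value is (row mean + column mean - grand mean),
so that $A$ can be written down without computing a generalized inverse.
The function names and the 4x5 size are chosen only for the example.

```python
from fractions import Fraction


def residual_matrix(r, c):
    """A = I - H for the additive two-way layout, one observation per cell."""
    cells = [(i, j) for i in range(r) for j in range(c)]
    n = r * c
    A = []
    for (i, j) in cells:
        row = []
        for (k, l) in cells:
            # Hat-matrix entry implied by: row mean + column mean - grand mean.
            hat = Fraction(int(i == k), c) + Fraction(int(j == l), r) - Fraction(1, n)
            row.append(Fraction(int((i, j) == (k, l))) - hat)
        A.append(row)
    return A


def matmul(P, Q):
    """Plain matrix product of two square lists of lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


A = residual_matrix(4, 5)
n = len(A)
is_symmetric = all(A[a][b] == A[b][a] for a in range(n) for b in range(n))
is_idempotent = matmul(A, A) == A
trace = sum(A[a][a] for a in range(n))  # equals the rank n - k when A is idempotent
print(is_symmetric, is_idempotent, trace)
```

For this layout $k = r + c - 1 = 8$, so the trace should equal
$n - k = 12$, and each diagonal element is $\lambda_{ii} = 12/20$.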

Let $s_\nu$ be a root mean square estimator of $\sigma$ based on
$\nu$ degrees of freedom, which is independent of $\mathbf{y}$, and let
$S_p^2 = S^2 + \nu s_\nu^2$ be the pooled sum of squares based on
$p = n-k+\nu$ degrees of freedom. Define

(1.2.5)  $$w_i = e_i/(S_p\lambda_{ii}^{1/2}), \quad i = 1, 2, \ldots, n,$$

as weighted residuals. The joint distribution of these residuals
has been discussed by Joshi (1972, 1975). This is generalized
by Ellenberg (1973) for the non-singular joint distribution of
$(w_1, \ldots, w_s)$. An ingenious method for obtaining these
distributions is also given by Margolin (1977).

In Chapter II, using linear combinations of these
residuals, we suggest some test statistics for detection of two
outliers. We denote the one-sided statistic by $U$ and the
two-sided statistic by $V$. These are given by

$$U = \max_{1 \le i < j \le n} u_{ij} \quad \text{and} \quad V = \max_{1 \le i < j \le n} |v_{ij}|,$$

where for $i \ne j$ (we assume that the maximum occurs for a single
pair),

(1.2.6)  $$u_{ij} = (w_i + w_j)/[2(1+\rho_{ij})]^{1/2},$$

(1.2.7)  $$v_{ij} = (w_i - w_j)/[2(1-\rho_{ij})]^{1/2},$$

and

$$\rho_{ij} = \lambda_{ij}/(\lambda_{ii}\lambda_{jj})^{1/2}$$

is the correlation coefficient between the residuals $e_i$ and $e_j$.

The one-sided statistic $U$ can be used for two outliers on
the right side. For two outliers on the left, a suitable test
statistic is

$$U_{-} = -\min_{1 \le i < j \le n} u_{ij}.$$

The distribution properties of $U_{-}$ are analogous to those of $U$.
Hence it is sufficient to study $U$ for two outliers on the right.

In order to apply these tests we require the exact null
distribution of these statistics when there are no outliers. This
is extremely complicated and, among other things, depends on the
design matrix $X$. However, the marginal distributions of $u_{ij}$ and
$v_{ij}$ are identical. The common distribution is given by

$$f(u_{12}) = (1-u_{12}^2)^{(p-3)/2}/B[1/2, (p-1)/2], \quad -1 < u_{12} < 1.$$

Using the first Bonferroni inequality, this allows us to obtain
an upper bound for the true upper percentage point. This gives the
nominal upper percentage point, which controls the probability
of type I error. A lower bound is obtained by considering the
joint distribution of $u_{ij}$'s in some special cases. This requires
the evaluation of bivariate probabilities like
$\Pr(u_{ij} > c, u_{i'j'} > c)$. In this regard we have provided some
recurrence relations for the evaluation of such bivariate
probability terms.

A two-sided discordancy test for $m_1$ outliers (irrespective
of directions) in a random sample from $N(\mu, \sigma^2)$ is studied by
Tietjen and Moore (1972). Their statistic, called the "largest
gap", which is denoted as $T_{N16}$ by Barnett and Lewis (1978),
is given by

(1.2.8)  $$T_{N16} = \sum_{j=1}^{n-m_1} (r_{(j)} - \bar{r}_{n-m_1})^2 \Big/ \sum_{j=1}^{n} (r_{(j)} - \bar{r})^2.$$

Here $r_j = |y_j - \bar{y}|$ is the absolute deviation of $y_j$ from the
sample mean $\bar{y}$; $\{r_{(j)}\}$ are the values of $r_j$ in ascending
order, $r_{(1)} \le \cdots \le r_{(n)}$; $\bar{r}$ is the mean of all the
$r_j$; and

$$\bar{r}_{n-m_1} = (r_{(1)} + r_{(2)} + \cdots + r_{(n-m_1)})/(n-m_1).$$

For the general linear model considered above, an obvious
generalization using weighted residuals is as follows. Let
$r^*_i = |w_i|$, let $r^*_{(1)} \le r^*_{(2)} \le \cdots \le r^*_{(n)}$ be
the ordered values,

$$\bar{r}^* = \sum_{i=1}^{n} r^*_i/n \quad \text{and} \quad \bar{r}^*_{n-m_1} = \sum_{j=1}^{n-m_1} r^*_{(j)}/(n-m_1).$$

Then

$$T^*_{N16} = \sum_{j=1}^{n-m_1} (r^*_{(j)} - \bar{r}^*_{n-m_1})^2 \Big/ \sum_{j=1}^{n} (r^*_{(j)} - \bar{r}^*)^2$$

can be used for detection of $m_1$ outliers when the direction is
unknown. Note that for a random sample of size $n$ from
$N(\mu, \sigma^2)$, we have $\lambda_{ii} = \lambda = (n-1)/n$ for all
$i = 1, \ldots, n$. Consequently $T^*_{N16} = T_{N16}$.

Because of the modulus sign involved in this statistic, it
is not easily comprehensible and is very complicated to deal with.
Also, as pointed out by David (1981, p. 240), even $T_{N16}$ does not
necessarily pick up the $m_1$ most outlying observations.

Shapiro and Wilk (1965) have also proposed a statistic for
random samples from $N(\mu, \sigma^2)$, which we denote again in Barnett
and Lewis's (1978) notation as $T_{N17}$. This is given by

$$T_{N17} = \Big[\sum_{i=1}^{[n/2]} a_{n,n-i+1}\,(y_{(n-i+1)} - y_{(i)})\Big]^2 \Big/ S^2,$$

where $[n/2]$ denotes the integer part of $n/2$, $a_{n,n-i+1}$ are
tabulated constants, and $y_{(1)} \le y_{(2)} \le \cdots \le y_{(n)}$ are
the ordered observations $y_1, y_2, \ldots, y_n$.

For linear models one can use a generalized statistic

$$T^*_{N17} = \Big[\sum_{i=1}^{[n/2]} a^*_{n,n-i+1}\,(w_{(n-i+1)} - w_{(i)})\Big]^2,$$

where $w_{(1)} \le w_{(2)} \le \cdots \le w_{(n)}$ are the ordered
$w_1, w_2, \ldots, w_n$ and $a^*_{n,n-i+1}$ are suitable constants.

These statistics $T_{N16}$ and $T_{N17}$ have not been studied in
complete detail so far. The generalizations $T^*_{N16}$ and $T^*_{N17}$
will require extensive calculations even for well-behaved patterned
design matrices $X$. For this reason, we have not considered such
statistics and have concentrated on the $U$ and $V$ statistics only.

1.3. Application to a random sample from $N(\mu, \sigma^2)$

In Chapter III we consider a random sample $Y_1, Y_2, \ldots, Y_n$
of size $n$ from a normal population with mean $\mu$ and variance
$\sigma^2$. Let

$$y_{(1)} \le y_{(2)} \le \cdots \le y_{(n)}$$

be the order statistics, obtained by arranging $y_1, y_2, \ldots, y_n$
in ascending order of magnitude.

This can be considered as a special case of the general
regression model and the test statistics $U$ and $V$ can be obtained
from the regression model case. For $\nu = 0$, the one-sided
statistic $U$ can be compared with Murphy's statistic $M$ for
two outliers, which is given by

(1.3.1)  $$M = (y_{(n)} + y_{(n-1)} - 2\bar{y})/S,$$

where

(1.3.2)  $$S^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2.$$

The two-sided statistic $V$ can be compared with the internally
studentized range

(1.3.3)  $$T_{N6} = (y_{(n)} - y_{(1)})/s,$$

where $s^2 = S^2/(n-1)$.

We show that for $\nu = 0$, our statistics $U$ and $V$, when
multiplied by suitable constants, reduce to $M$ and $T_{N6}$
respectively. Using nominal percentile points of $U$ and $V$,
nominal critical values of $M$ and $T_{N6}$ are calculated. These
nominal critical values of $M$ are compared with the exact values
tabulated by Hawkins (1978) for Murphy's test. The $T_{N6}$ values
are compared with the simulated values obtained by Barnett and
Lewis (1978). It is observed that the nominal percentage points
are reasonable for $n < 50$.
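
The reduction of $U$ and $V$ to $M$ and $T_{N6}$ can be checked
numerically. The following sketch assumes $\nu = 0$, $\lambda_{ii} =
(n-1)/n$ and $\rho_{ij} = -1/(n-1)$ for a random sample, which give the
constants $[n/\{2(n-2)\}]^{1/2}$ for $M$ and $[2(n-1)]^{-1/2}$ for
$T_{N6}$. The function names and the sample values are hypothetical and
serve only as an illustration.

```python
from itertools import combinations
from math import isclose, sqrt


def u_v_statistics(y):
    """U and V from weighted residuals of a random sample (nu = 0 assumed)."""
    n = len(y)
    ybar = sum(y) / n
    e = [yi - ybar for yi in y]
    S = sqrt(sum(ei * ei for ei in e))
    lam = (n - 1) / n          # common diagonal element of A
    rho = -1 / (n - 1)         # common correlation between residuals
    w = [ei / (S * sqrt(lam)) for ei in e]
    pairs = list(combinations(range(n), 2))
    U = max((w[i] + w[j]) / sqrt(2 * (1 + rho)) for i, j in pairs)
    V = max(abs(w[i] - w[j]) / sqrt(2 * (1 - rho)) for i, j in pairs)
    return U, V


def murphy_and_range(y):
    """Murphy's M of (1.3.1) and the internally studentized range of (1.3.3)."""
    n = len(y)
    ybar = sum(y) / n
    S = sqrt(sum((yi - ybar) ** 2 for yi in y))
    s = S / sqrt(n - 1)
    ordered = sorted(y)
    M = (ordered[-1] + ordered[-2] - 2 * ybar) / S
    T_N6 = (ordered[-1] - ordered[0]) / s
    return M, T_N6


y = [10.2, 9.8, 11.5, 10.9, 14.3, 10.1, 13.7, 9.5]  # hypothetical sample
n = len(y)
U, V = u_v_statistics(y)
M, T_N6 = murphy_and_range(y)
print(isclose(U, sqrt(n / (2 * (n - 2))) * M), isclose(V, T_N6 / sqrt(2 * (n - 1))))
```

Since every $u_{ij}$ is the same positive multiple of $e_i + e_j$, the
maximum is attained at the two largest observations, which is why the
comparison with $M$ holds for any sample.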

We also obtain a lower bound for the type I error probability
using the second Bonferroni inequality. These are tabulated for
different values of $n$ and significance level $\alpha$.

Finally, a method for finding approximate upper percentiles
of Murphy's test statistic for two outliers is discussed. Comparing
with the tabulated values, it is observed that the approximation
is remarkably good for all values of $n$.

1.4. Application to a two-way layout

Tests of multiple outliers in a two-way layout have been
discussed by several authors, such as Gentleman and Wilk (1975a, b),
John and Draper (1978) and Bradu and Hawkins (1982). Some of
them have used residuals while others have used tetrads etc.

In Chapter IV we apply the statistics $U$ and $V$ for the detection
of two outliers in a two-way table having a single observation
in each cell.

In this chapter we mainly analyse the 'shape parameters'
which appear in the joint distribution of two $u$'s or two $v$'s.
These shape parameters are then used for finding bounds for
actual percentile points. For comparison purposes, actual
percentile points are also obtained by the Monte Carlo method for
some special cases.

1.5. Performance of the statistics 

In Chapter V we study the performance of test statistics 
in non-null situation when two outliers are present. We now 
assume that exactly two outliers are present. The null hypothesis 
for testing for two outliers specifies that there are no outliers. 

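To evaluate the performance of the statistic $U$, the
alternative hypothesis is the union of $\binom{n}{2}$ hypotheses
$H_{ij}$, $1 \le i < j \le n$. Without loss of generality we take
$H_{12}$. For testing two outliers on the right, the model under
$H_{12}$ is

$$E(\mathbf{y}) = X\boldsymbol{\beta} + \mathbf{e}_1\theta_1 + \mathbf{e}_2\theta_2,$$

where $\mathbf{e}_s$ ($s = 1, 2, \ldots, n$) is the $s$th column of $I_n$
and $\theta_i > 0$, $i = 1, 2$.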

For determining the performance of the two-sided statistic
$V$, the model under $H_{12}$ is given by

$$E(\mathbf{y}) = X\boldsymbol{\beta} - \mathbf{e}_1\theta_1 + \mathbf{e}_2\theta_2; \quad \theta_1, \theta_2 > 0.$$

In this case the alternative hypothesis is the union of
$n(n-1)$ such hypotheses.

We have obtained the exact and approximate distributions
of $u_{ij}$'s and $v_{ij}$'s under the alternative hypothesis. For $U$,
the approximate method does not give satisfactory results for
small values of $\theta_1$ and $\theta_2$. Consequently, its performance
is evaluated with the exact density function for small $\theta_1$ and
$\theta_2$. For large $\theta_1$, $\theta_2$ we evaluate it approximately
also. Let

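$$P_{12} = \Pr(u_{12} > u_\alpha \mid H_{12})$$

and consider likewise $\Pr(|v_{12}| > v_\alpha \mid H_{12})$, where
$u_\alpha$ and $v_\alpha$ denote the critical values.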

We use these and related quantities for studying the
performance of the proposed test statistics. These measures are
calculated for random samples from $N(\mu, \sigma^2)$ for several values
of $n$.

The same study is carried out for two-way tables also.
We first determine the cells which would give a minimum
probability of identification of outliers, if they are present
in them. Then we evaluate the measures used for studying the
performance.

For comparison purposes we consider the sequential
procedure. The possibility of testing multiple outliers
sequentially for discordancy has been mentioned by a number
of authors like Pearson and Chandra Sekar (1936), Dixon (1953),
Tietjen and Moore (1972) and David (1981, p. 239). The
sequential procedure has been compared with other block
procedures by McMillan (1971), McMillan and David (1971), and
Moran and McMillan (1973).

We compare our procedure, in the case of a random sample from a
normal distribution, with the sequential procedure whose
exact performance values are obtained by Moran and McMillan
(1973). For two-way tables we consider the sequential procedure
suggested by Anscombe (1960). For all these cases our procedure
performs better than the sequential procedure.

1.6. Extension for more than two-outlier cases

The one-sided statistic $U$ has been extended for more
than two outliers, which for a random sample from $N(\mu, \sigma^2)$ is
equivalent to Murphy's test statistic. Distributions in the
null case are analogous to those of the two-outlier case. We have
tabulated the nominal percentile points for the three-outlier case.

For studying the performance of the statistic for 3
outliers on the right, we have $\binom{n}{3}$ alternative hypotheses
$H_{hij}$ ($1 \le h < i < j \le n$), where under $H_{123}$,

$$E(\mathbf{y} \mid H_{123}) = X\boldsymbol{\beta} + \mathbf{e}_1\theta_1 + \mathbf{e}_2\theta_2 + \mathbf{e}_3\theta_3,$$

$\mathbf{e}_s$ ($s = 1, 2, \ldots, n$) is the $s$th column of $I$,
$\theta_i > 0$ ($i = 1, 2, 3$) and

$$\mathrm{Var}(\mathbf{y} \mid H_{123}) = \sigma^2 I.$$

The distribution of the test statistic in the non-null case is
discussed in detail. Again the results are analogous to the
two-outlier case. We have studied the performance of Murphy's
test statistic for the three-outlier case. It is observed that the
test for three outliers performs well when outliers of the same
magnitude are present. If there are only two outliers, and a
test for three outliers is applied, then the performance is not
good for small values of $n$. But it is reasonable for large
values of $n$. However, if there is only one outlier and a test
for three outliers is applied, then it performs very poorly
even for large values of $n$.

General results for $m_1$ ($> 3$) outliers are similar. We also
note that such extensions for three or more outliers are possible
for the one-sided statistic $U$ only. No immediate extension
seems to hold for the two-sided case.

1.7. Notations

The following notations will be used in this thesis:

pdf : probability density function,

$\overset{d}{=}$ : is distributed according to,

$\simeq$ : is approximately equal to,

$\equiv$ : is equivalent to,

$\Gamma(a) = \int_0^\infty x^{a-1} e^{-x}\,dx$, $a > 0$,

$B(a,b) = \int_0^1 x^{a-1}(1-x)^{b-1}\,dx$, $a > 0$, $b > 0$,

$I_y(a,b) = \int_0^y x^{a-1}(1-x)^{b-1}\,dx \,/\, B(a,b)$,

$N(\mu, \sigma^2)$ : normal distribution with mean $\mu$ and variance
$\sigma^2$,

$\chi^2_a$ : central chi-square distribution with $a$ degrees of
freedom

and

$\chi^2(a,b)$ : non-central chi-square distribution with $a$ degrees
of freedom and non-centrality parameter $b$.



CHAPTER II 


TWO OUTLIERS IN A GENERAL LINEAR REGRESSION MODEL 

2.1. Introduction and test statistics

The problem of detecting two outliers in a general linear
model is considered in this chapter. Let $Y_1, Y_2, \ldots, Y_n$ be
independently and normally distributed random variables and
$y_1, y_2, \ldots, y_n$ be the realization of these variables. We
consider the model as described in equation (1.2.1) of Section 1.2.
Then we have the residual vector $\mathbf{e}$, the variance-covariance
matrix of $\mathbf{e}$ and the residual sum of squares $S^2$ as in
equations (1.2.2), (1.2.3) and (1.2.4) respectively. The random
variables used for the detection of a single outlier are the weighted
residuals $w_i$ given at equation (1.2.5), viz.

$$w_i = e_i/(S_p\lambda_{ii}^{1/2}), \quad i = 1, 2, \ldots, n,$$

where $S_p^2$ is a pooled sum of squares based on
$p = n-k+\nu$ degrees of freedom.

The statistics $\max_{1 \le i \le n} w_i$ and
$\max_{1 \le i \le n} |w_i|$ are used for the detection of a single
outlier; for example, see Srikantan (1961), Joshi (1972, 1975),
Ellenberg (1973, 1976), Lund (1975), Cook and Prescott (1981), and
Doornbos (1981). For the case of two outliers, we propose a linear
combination of $w_i$ and $w_j$. The proposed statistics generalize
the well-known Murphy's test and a test based on the studentized
range for the case of a random sample from a $N(\mu, \sigma^2)$
population. For $i \ne j = 1, 2, \ldots, n$, define $u_{ij}$ and
$v_{ij}$ by

(2.1.1)  $$u_{ij} = (w_i + w_j)/[2(1+\rho_{ij})]^{1/2} \quad \text{and} \quad v_{ij} = (w_i - w_j)/[2(1-\rho_{ij})]^{1/2},$$

where $\rho_{ij}$ is the correlation coefficient between $e_i$ and $e_j$
and is given by

(2.1.2)  $$\rho_{ij} = \lambda_{ij}/(\lambda_{ii}\lambda_{jj})^{1/2}.$$

We now propose the test statistics for two outliers as

$$U = \max_{1 \le i < j \le n} u_{ij} \quad \text{and} \quad V = \max_{1 \le i < j \le n} |v_{ij}|.$$

The statistic $U$ is useful for detecting two outliers on the
right, while $V$ is useful for detecting two outliers in either
direction.

For two outliers on the left, a suitable test statistic is
given by

$$U_{-} = -\min_{1 \le i < j \le n} u_{ij}.$$

The distribution properties of $U_{-}$ are analogous to those
of $U$. Hence we consider only $U$ in detail.

Thus to detect two outliers, we compute all the $u_{ij}$'s and
$v_{ij}$'s and find the corresponding $U$, $V$ and $U_{-}$. If the
statistic $U$ exceeds the critical value $u_\alpha$ at the $\alpha$ level
of significance, then the two observations corresponding to the maximum
$u_{ij}$ are declared as outliers on the right side; if we suspect that
the two outliers are present on either end, then the maximum of the
absolute values of $v_{ij}$ would identify them; and for outliers
on the left, the observations corresponding to the minimum of the
$u_{ij}$'s would identify them.

2.1.1. As an illustration of our procedure, consider
the hypothetical data given in Table 2.1.1 of a 4x5 layout given
by Daniel (1960) and discussed by Bross (1961), Mickey et al.
(1967) and Doornbos (1981). Let $y_{ij}$ ($i = 1, 2, \ldots, 4$ and
$j = 1, 2, \ldots, 5$) denote the observations of this two-way table.
The residual for the $(i,j)$th cell will be denoted by $e_{ij}$
($i = 1, 2, \ldots, 4$; $j = 1, 2, \ldots, 5$), and also by single
subscripts.

TABLE 2.1.1. Hypothetical yields

Levels of A |          Levels of B
            |    1     2     3     4     5
------------+------------------------------
     1      |   35    29    25    19    22
     2      |   32    29    29    25    20
     3      |   37    34    30    25    29
     4      |   40    36    20    35    29

The corresponding residuals are given by

Levels of A |          Levels of B
            |    1     2     3     4     5
------------+------------------------------
     1      |    2     0     2    -4     0
     2      |   -2    -1     5     1    -3
     3      |   -1     0     2    -3     2
     4      |    1     1    -9     6     1

For this data $S^2 = 202.0$. Further, the variance of each
residual is $\lambda\sigma^2$, where $\lambda = 12/20$. The two largest
positive residuals are $e_{23}$ and $e_{44}$, while the others are much
smaller. Consequently, these two are likely to give the maximum value
of $u_{ij}$. It is indeed so, and the value of $U$ (for $\nu = 0$) is
given by

$$U = \max_{1 \le i < j \le n} u_{ij} = [1/(S_p\lambda^{1/2})] \max_{1 \le i < j \le n} (e_i + e_j)/(2 + 2\rho_{ij})^{1/2}$$

$$= (5+6)/[(202 \times 0.6)^{1/2}(2 + 2/12)^{1/2}] = 0.6788.$$

The observations which give the maximum of $u_{ij}$ are $y_{23}$
and $y_{44}$. Similarly, $V$ is given by

$$V = \max_{1 \le i < j \le n} |v_{ij}| = [1/(S_p\lambda^{1/2})] \max_{1 \le i < j \le n} |(e_i - e_j)/(2 - 2\rho_{ij})^{1/2}|$$

$$= (6+9)/[(202 \times 0.6)^{1/2}(2 + 2/4)^{1/2}] = 0.8617.$$

The corresponding observations which give rise to $V$ are
$y_{43}$ and $y_{44}$.

Similarly, the value of $U_{-}$ is

$$U_{-} = -\min_{1 \le i < j \le n} u_{ij} = -[1/(S_p\lambda^{1/2})] \min_{1 \le i < j \le n} (e_i + e_j)/(2 + 2\rho_{ij})^{1/2}$$

$$= -(-9-4)/[(202 \times 0.6)^{1/2}(2 + 2/12)^{1/2}] = 0.8022.$$

The observations corresponding to the minimum of the $u_{ij}$'s are
$y_{14}$ and $y_{43}$. Depending upon the model under consideration,
these observations are the prime suspected outliers. Thus $y_{23}$ and
$y_{44}$ on the right side, $y_{43}$ and $y_{44}$ on either side, and
$y_{14}$ and $y_{43}$ on the left side are the potential outliers.
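
The arithmetic of this illustration can be reproduced with a short
script. This is a sketch under stated assumptions rather than the
author's own program: residuals are taken as (observation - row mean -
column mean + grand mean), $\nu = 0$, and the correlations between
residuals are taken to be $-1/(c-1)$ within a row, $-1/(r-1)$ within a
column and $1/\{(r-1)(c-1)\}$ otherwise, which are the values implied by
the worked figures $2/4$ and $2/12$ above. Variable names are
illustrative, and cells are indexed from zero inside the code.

```python
from itertools import combinations
from math import sqrt

# Table 2.1.1, rows = levels of A, columns = levels of B.
y = [[35, 29, 25, 19, 22],
     [32, 29, 29, 25, 20],
     [37, 34, 30, 25, 29],
     [40, 36, 20, 35, 29]]
r, c = len(y), len(y[0])

row_mean = [sum(row) / c for row in y]
col_mean = [sum(y[i][j] for i in range(r)) / r for j in range(c)]
grand = sum(map(sum, y)) / (r * c)

resid = {(i, j): y[i][j] - row_mean[i] - col_mean[j] + grand
         for i in range(r) for j in range(c)}
S2 = sum(e * e for e in resid.values())      # residual sum of squares
lam = (r - 1) * (c - 1) / (r * c)            # common lambda_ii


def rho(a, b):
    """Assumed correlation between the residuals of two distinct cells."""
    if a[0] == b[0]:
        return -1.0 / (c - 1)                # same row
    if a[1] == b[1]:
        return -1.0 / (r - 1)                # same column
    return 1.0 / ((r - 1) * (c - 1))         # different row and column


scale = sqrt(S2 * lam)                       # S_p * lambda^{1/2} with nu = 0
w = {cell: e / scale for cell, e in resid.items()}

pairs = list(combinations(sorted(w), 2))
u = {(a, b): (w[a] + w[b]) / sqrt(2 * (1 + rho(a, b))) for a, b in pairs}
v = {(a, b): (w[a] - w[b]) / sqrt(2 * (1 - rho(a, b))) for a, b in pairs}

U_pair = max(u, key=u.get)
V_pair = max(v, key=lambda k: abs(v[k]))
Um_pair = min(u, key=u.get)
U, V, U_minus = u[U_pair], abs(v[V_pair]), -u[Um_pair]


def label(pair):
    """Convert zero-based cells to the y_ij labels used in the text."""
    return ["y%d%d" % (i + 1, j + 1) for i, j in pair]


print(round(U, 4), label(U_pair))
print(round(V, 4), label(V_pair))
print(round(U_minus, 4), label(Um_pair))
```

Searching over all 190 pairs, rather than only the largest residuals,
confirms that the pairs singled out in the text do attain the maximum
and minimum.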


2.2. Distribution theory

We will first obtain the marginal densities of $u_{ij}$'s and
$v_{ij}$'s and then consider the joint densities. Without loss of
generality, we derive the marginal pdf of $u_{12}$ and $v_{12}$. The
bivariate density of $w_1$ and $w_2$ as defined in (1.2.5) is given
by Joshi (1972):

(2.2.1)  $$g(w_1, w_2) = \frac{p-2}{2\pi(1-\rho^2)^{1/2}} \left(1 - \frac{w_1^2 - 2\rho w_1 w_2 + w_2^2}{1-\rho^2}\right)^{(p-4)/2},$$

where $\rho = \rho_{12}$ and the region of positive density is the
interior of the ellipse

$$w_1^2 - 2\rho w_1 w_2 + w_2^2 = 1 - \rho^2.$$

For finding the joint probability density function (pdf)
of $u_{12}$ and $v_{12}$, consider the transformation

$$u = (w_1 + w_2)/[2(1+\rho)]^{1/2} = a(w_1 + w_2)$$

and

$$v = (w_1 - w_2)/[2(1-\rho)]^{1/2} = b(w_1 - w_2),$$

where

$$a = 1/[2(1+\rho)]^{1/2} \quad \text{and} \quad b = 1/[2(1-\rho)]^{1/2}.$$

The inverse transformation is

$$w_1 = (u/a + v/b)/2, \qquad w_2 = (u/a - v/b)/2$$

and the Jacobian of the transformation in absolute value is given by

$$|J| = \left|\frac{\partial(w_1, w_2)}{\partial(u, v)}\right| = \left|\det\begin{pmatrix} 1/(2a) & 1/(2b) \\ 1/(2a) & -1/(2b) \end{pmatrix}\right| = 1/(2ab).$$

Consequently, the joint pdf of $u_{12} = u$ and $v_{12} = v$ is

$$g(u, v) = \frac{p-2}{4\pi(1-\rho^2)^{1/2}\,ab} \left[1 - \frac{1}{2(1-\rho^2)}\left\{\frac{(1-\rho)u^2}{a^2} + \frac{(1+\rho)v^2}{b^2}\right\}\right]^{(p-4)/2}.$$

Substituting for $a$ and $b$ and simplifying we get

(2.2.2)  $$g(u, v) = \frac{p-2}{2\pi}(1 - u^2 - v^2)^{(p-4)/2}, \quad u^2 + v^2 < 1.$$

Integrating out v, we get the marginal pdf of u as

    $g(u) = \frac{p-2}{2\pi} \int_{-(1-u^2)^{1/2}}^{(1-u^2)^{1/2}} (1 - u^2 - v^2)^{(p-4)/2}\, dv$.

Now substituting $t = v/(1-u^2)^{1/2}$, we have

    $g(u) = \frac{p-2}{2\pi}(1-u^2)^{(p-3)/2} \int_{-1}^{1} (1-t^2)^{(p-4)/2}\, dt$

(2.2.3)    $\quad = (1-u^2)^{(p-3)/2}/B[1/2, (p-1)/2]$,  $-1 < u < 1$.

Due to symmetry in equation (2.2.2) with respect to u and v, the marginal pdf of $v_{12}$ is exactly the same as that of $u_{12}$. It should be noted that the marginal pdf of $w_i$ is also the same as that of $u_{12}$ given at equation (2.2.3).
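
As an illustration of the step from (2.2.2) to (2.2.3), the sketch below integrates the joint density over v by a composite Simpson rule and compares the result with the closed form. It is a numerical check only, assuming $p \ge 4$ so that the integrand is bounded; the function names are hypothetical.

```python
import math

def beta_fn(x, y):
    """Complete beta function B(x, y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def marginal_by_quadrature(u, p, m=2000):
    """Integrate (2.2.2) over v by composite Simpson (m even); assumes p >= 4."""
    half = math.sqrt(1.0 - u * u)
    step = 2.0 * half / m
    total = 0.0
    for i in range(m + 1):
        v = -half + i * step
        weight = 1 if i in (0, m) else (4 if i % 2 else 2)
        base = max(0.0, 1.0 - u * u - v * v)  # guard against rounding at the ends
        total += weight * (p - 2) / (2.0 * math.pi) * base ** ((p - 4) / 2.0)
    return total * step / 3.0

def marginal_closed_form(u, p):
    """Marginal density (2.2.3)."""
    return (1.0 - u * u) ** ((p - 3) / 2.0) / beta_fn(0.5, (p - 1) / 2.0)
```

For example, `marginal_by_quadrature(0.3, 6)` and `marginal_closed_form(0.3, 6)` should agree to rounding error, since the integrand is then a quadratic in v.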

For finding the joint pdf of two or more $u_{ij}$'s or $v_{ij}$'s, it is possible to proceed on analogous lines. We start with the joint distribution of $(w_1, w_2, \ldots, w_s)$ as given by Ellenberg (1973), and obtain the desired joint distribution by means of suitable transformations. However, the integration of extra variables becomes tedious. We therefore proceed on lines similar to Ellenberg, using independence of certain quadratic forms. The main result is given in Theorem 2.2.1. We need the following lemma.


Lemma 2.2.1.

Let the residual vector e and the residual sum of squares $S^2$ be as given by equations (1.2.2) and (1.2.4) respectively. Define

    $z = De$,

where D is a diagonal matrix given by

    $D = \mathrm{diag}(a_{11}^{-1/2}, a_{22}^{-1/2}, \ldots, a_{nn}^{-1/2})$,

and let

(2.2.4)    $DAD' = R = ((\rho_{ij}))$

be the correlation matrix of the residual vector e. Let M be an $s \times n$ matrix such that $C = MRM'$ is positive definite. Consider a linear transformation

    $T = Mz$,

where T is $s \times 1$, M is $s \times n$ and z is $n \times 1$. Then

(i) T is distributed as an s-variate $N(0, C\sigma^2)$ variate,

(ii) $S_1^2/\sigma^2$ is distributed as a $\chi^2$ variate with $(n-k-s)$ degrees of freedom, where

    $S_1^2 = S^2 - T'C^{-1}T$,

(iii) $S_1^2$ and T are independently distributed.

Proof: Since $e \sim N(0, A\sigma^2)$, hence

    $z = De \sim N(0, DAD'\sigma^2)$, that is, $N(0, R\sigma^2)$.

Consequently,

    $T = Mz \sim N_s(0, C\sigma^2)$.

This distribution is non-singular due to the assumption that C is positive definite.

Next consider the quadratic form

    $T'C^{-1}T = z'M'C^{-1}Mz = e'D'M'C^{-1}MDe = y'A'D'M'C^{-1}MDAy = y'By$ (say),

where

(2.2.5)    $B = AD'M'C^{-1}MDA$

is a real symmetric matrix with

    $BA = AD'M'C^{-1}MDAA = B$.

Similarly, $AB = B$. Further

    $BB = AD'M'C^{-1}MDAAD'M'C^{-1}MDA = AD'M'C^{-1}MDA = B$,

on using that A is idempotent and $DAD' = R$. Thus B is an idempotent matrix with rank given by

    $\mathrm{rank}(B) = \mathrm{tr}(B) = \mathrm{tr}(AD'M'C^{-1}MDA) = \mathrm{tr}(MDAAD'M'C^{-1}) = \mathrm{tr}(CC^{-1}) = \mathrm{tr}(I_s) = s$.

This at once gives that

    $\frac{1}{\sigma^2} y'By = \frac{1}{\sigma^2} T'C^{-1}T \sim \chi^2_s$ (central),

since the noncentrality parameter is

    $\frac{1}{\sigma^2}\beta'X'BX\beta = \frac{1}{\sigma^2}\beta'X'BAX\beta = 0$, as $B = BA$ and $AX = 0$.

The distribution of $T'C^{-1}T$ can be obtained directly as well; for example, see Rao (1973, p. 524).

Next, consider

    $S_1^2 = S^2 - T'C^{-1}T = y'Ay - y'By = y'(A - B)y$.

The matrix $(A - B)$ is an idempotent matrix, since $AB = BA = B$ and A and B are idempotent matrices. Further, the rank of $(A - B)$ is $(n-k-s)$. Thus

    $S_1^2/\sigma^2 \sim \chi^2_{n-k-s}$,

where the non-centrality parameter is again zero. Now, consider the decomposition

    $y'y = y'(I - A)y + y'(A - B)y + y'By$.

Since $n = \mathrm{rank}(I - A) + \mathrm{rank}(A - B) + \mathrm{rank}(B)$, hence by the Fisher-Cochran Theorem it follows that the quadratic forms appearing on the R.H.S. are independent. In particular $y'(A - B)y$ and $y'By$ are independent. This is even otherwise obvious, since $(A - B)B = AB - B^2 = B - B = 0$.

The independence of $S_1^2 = y'(A - B)y$ and $T = MDAy$ is also immediate, since

    $MDA(A - B) = MD(A - AB) = MDA - MDB$
    $\quad = MDA - MDAD'M'C^{-1}MDA$
    $\quad = MDA - CC^{-1}MDA$
    $\quad = MDA - MDA = 0$.

This completes the proof of the lemma.
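
The matrix identities used in the proof can be illustrated on a small case. The sketch below is an assumption-laden example, not taken from the thesis: it uses the mean-only model (X a column of ones, so $A = I - J/n$ and $k = 1$), takes $s = 2$ with the rows of M chosen as for $u_{12}$ and $u_{34}$, and checks that B is idempotent with trace s and that $MDA(A - B) = 0$. All helper names are hypothetical.

```python
import math

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][t] * Q[t][j] for t in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(row) for row in zip(*P)]

def inverse_2x2(C):
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    return [[C[1][1] / det, -C[0][1] / det],
            [-C[1][0] / det, C[0][0] / det]]

def lemma_matrices(n=5):
    """Build A, B, C and MDA for the mean-only model with n >= 4 observations."""
    A = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]
    D = [[A[i][i] ** -0.5 if i == j else 0.0 for j in range(n)] for i in range(n)]
    R = matmul(matmul(D, A), transpose(D))        # correlation matrix of residuals
    a = 1.0 / math.sqrt(2.0 * (1.0 + R[0][1]))    # all rho_ij are equal here
    M = [[a, a, 0.0, 0.0] + [0.0] * (n - 4),      # row giving u_12
         [0.0, 0.0, a, a] + [0.0] * (n - 4)]      # row giving u_34
    C = matmul(matmul(M, R), transpose(M))
    MDA = matmul(matmul(M, D), A)
    B = matmul(matmul(transpose(MDA), inverse_2x2(C)), MDA)   # equation (2.2.5)
    return A, B, C, MDA
```

Here B is formed as $(MDA)'C^{-1}(MDA)$, which equals $AD'M'C^{-1}MDA$ because A and D are symmetric.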


The notations used in Lemma 2.2.1 are also used in the following theorem, which gives the desired joint pdf.

Theorem 2.2.1. Let $S_p^2 = S^2 + \nu s_\nu^2$, $p = n-k+\nu$, where $s_\nu^2$ is an independent root mean square estimator of $\sigma^2$. Let

    $u_i = T_i/S_p$,  $i = 1, 2, \ldots, s$,

where $T' = (T_1, T_2, \ldots, T_s)$ is as in Lemma 2.2.1. Then the joint pdf of $u' = (u_1, u_2, \ldots, u_s)$ is given by

    $f(u_1, u_2, \ldots, u_s) = \frac{\Gamma(p/2)}{\Gamma\{(p-s)/2\}}\, \pi^{-s/2}\, |C|^{-1/2}\, (1 - u'C^{-1}u)^{(p-s-2)/2}$

inside the region

    $u'C^{-1}u < 1$.

Proof: Let $S_2^2 = S_1^2 + \nu s_\nu^2$, where $S_1^2$ is as in Lemma 2.2.1. Since $s_\nu^2$ is an independent root mean square estimator of $\sigma^2$, hence T and $s_\nu^2$ are independent. Consequently, by Lemma 2.2.1, T and $S_2^2$ are independent, with

    $T \sim N(0, C\sigma^2)$ and $S_2^2/\sigma^2 \sim \chi^2_{p-s}$.

Without loss of generality we take $\sigma = 1$. Further

(2.2.6)    $S_p^2 = S^2 + \nu s_\nu^2 = S_1^2 + T'C^{-1}T + \nu s_\nu^2 = S_2^2 + T'C^{-1}T$.

The joint pdf of $(u_1, u_2, \ldots, u_s)$ can now be obtained exactly as given by Ellenberg (1973), by considering suitable transformations. For this, we start with the joint density of T and $S_2^2$ given by

    $\frac{|C|^{-1/2}}{(2\pi)^{s/2}} \exp[-(T'C^{-1}T)/2] \cdot \frac{1}{2^{\nu_1}\Gamma(\nu_1)} (S_2^2)^{\nu_1 - 1} \exp(-S_2^2/2)$,

where $\nu_1 = (p-s)/2$. Make a transformation

    $u = T/S_p$ and $S_p = (S_2^2 + T'C^{-1}T)^{1/2}$,  $0 < S_p < \infty$.

This implies that $T = S_p u$ and

    $S_2^2 = S_p^2 - T'C^{-1}T = S_p^2 - S_p^2\, u'C^{-1}u = S_p^2 (1 - u'C^{-1}u)$.

The Jacobian of the transformation is given by

    $|J| = \left|\frac{\partial(T_1, T_2, \ldots, T_s, S_2^2)}{\partial(u_1, u_2, \ldots, u_s, S_p)}\right|$.

Now partitioning J as

    $J = \begin{pmatrix} J_{11} & J_{12} \\ J_{21} & J_{22} \end{pmatrix} = \begin{pmatrix} S_p I_s & u \\ -2S_p^2\, u'C^{-1} & 2S_p(1 - u'C^{-1}u) \end{pmatrix}$,

where $J_{11} = S_p I_s$ is $s \times s$, $J_{12} = u$ is $s \times 1$, $J_{21} = -2S_p^2\, u'C^{-1}$ is $1 \times s$ and $J_{22} = 2S_p(1 - u'C^{-1}u)$ is $1 \times 1$, we get the determinant of J as

    $|J| = |J_{11}|\, |J_{22} - J_{21} J_{11}^{-1} J_{12}|$
    $\quad = |S_p I_s|\, |2S_p(1 - u'C^{-1}u) + 2S_p^2\, u'C^{-1}(1/S_p)u|$
    $\quad = 2S_p^{s+1}$.

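Therefore, the joint pdf of u and $S_p$ is

    $f(u, S_p) = \frac{|C|^{-1/2}}{(2\pi)^{s/2}\, 2^{\nu_1}\Gamma(\nu_1)} \left[S_p^2 (1 - u'C^{-1}u)\right]^{\nu_1 - 1} e^{-S_p^2/2}\, (2S_p^{s+1})$.

Integrating out $S_p$ from 0 to $\infty$, we get the joint pdf of $(u_1, u_2, \ldots, u_s)$ as

    $f(u_1, u_2, \ldots, u_s) = \frac{|C|^{-1/2}}{2^{\nu_1 + s/2 - 1}\, \pi^{s/2}\, \Gamma(\nu_1)} (1 - u'C^{-1}u)^{\nu_1 - 1} \int_0^\infty S_p^{2\nu_1 + s - 1} e^{-S_p^2/2}\, dS_p$
    $\quad = \frac{\Gamma(\nu_1 + s/2)}{\pi^{s/2}\, \Gamma(\nu_1)}\, |C|^{-1/2} (1 - u'C^{-1}u)^{\nu_1 - 1}$.

Substituting $\nu_1 = (p-s)/2$, we have

(2.2.7)    $f(u) = \frac{\Gamma(p/2)}{\Gamma\{(p-s)/2\}}\, \pi^{-s/2}\, |C|^{-1/2} (1 - u'C^{-1}u)^{(p-s-2)/2}$.

This completes the proof of the theorem.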
Corollary 2.2.1. For $s = 1$, the distribution of

    $u = \sum_{i=1}^{n} m_i w_i$

is

    $f(u) = \frac{(1 - u^2/C)^{(p-3)/2}}{C^{1/2}\, B[1/2, (p-1)/2]}$,  $u^2 < C$,

where $C = MRM'$ and $M = (m_1, m_2, \ldots, m_n)$.

Proof: For $s = 1$, we have from Lemma 2.2.1,

    $T = Mz = \sum_{i=1}^{n} m_i z_i = \sum_{i=1}^{n} m_i e_i/a_{ii}^{1/2}$.

Hence

    $u = T/S_p = \sum_{i=1}^{n} m_i e_i/(a_{ii}^{1/2} S_p) = \sum_{i=1}^{n} m_i w_i$, by (1.2.5).

On applying Theorem 2.2.1, the pdf of u is

    $f(u) = \frac{\Gamma(p/2)}{C^{1/2}\, \pi^{1/2}\, \Gamma\{(p-1)/2\}} (1 - u^2/C)^{(p-3)/2} = \frac{(1 - u^2/C)^{(p-3)/2}}{C^{1/2}\, B[1/2, (p-1)/2]}$,  $u^2 < C$.

Hence the corollary is proved.
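
A small simulation can illustrate the corollary. The sketch below is an assumption-laden example rather than anything from the thesis: it takes the mean-only model with $\nu = 0$, so that $p = n - 1$, $a_{ii} = 1 - 1/n$ and $\rho_{ij} = -1/(n-1)$, and simulates $u_{12}$. Under the density above with $C = 1$, $u^2$ follows a beta distribution with parameters $1/2$ and $(p-1)/2$, so its mean should be close to $1/p$. The function name and defaults are hypothetical.

```python
import math
import random

def simulate_u12(n=6, reps=20000, seed=1):
    """Simulate u_12 for the mean-only model with no external variance estimate."""
    rng = random.Random(seed)
    rho = -1.0 / (n - 1)          # correlation between any two residuals
    a_ii = 1.0 - 1.0 / n          # diagonal element of A = I - J/n
    values = []
    for _ in range(reps):
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ybar = sum(y) / n
        e = [yi - ybar for yi in y]
        s_p = math.sqrt(sum(ei * ei for ei in e))   # nu = 0, so S_p^2 = S^2
        w1 = e[0] / (math.sqrt(a_ii) * s_p)
        w2 = e[1] / (math.sqrt(a_ii) * s_p)
        values.append((w1 + w2) / math.sqrt(2.0 * (1.0 + rho)))
    return values
```

With `n = 6` the simulated mean of $u_{12}^2$ should be near $1/5$, and every simulated value should lie in $[-1, 1]$.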

For the joint pdf of $u_{ij}$ and $u_{i_1 j_1}$, we use equation (2.2.7) with $s = 2$ and take

    $M = \begin{pmatrix} 0 & \cdots & 1/[2(1+\rho_{ij})]^{1/2} & \cdots & 1/[2(1+\rho_{ij})]^{1/2} & \cdots & 0 \\ 0 & \cdots & 1/[2(1+\rho_{i_1 j_1})]^{1/2} & \cdots & 1/[2(1+\rho_{i_1 j_1})]^{1/2} & \cdots & 0 \end{pmatrix}$,

where the non-zero elements occur at the ith and jth positions in the first row and the $i_1$th and $j_1$th positions in the second row. Now

    $C = MRM' = \begin{pmatrix} 1 & \rho_1 \\ \rho_1 & 1 \end{pmatrix}$,

where

(2.2.8)    $\rho_1 = \rho_1(u_{ij}, u_{i_1 j_1}) = \frac{\rho_{i i_1} + \rho_{i j_1} + \rho_{j i_1} + \rho_{j j_1}}{2[(1+\rho_{ij})(1+\rho_{i_1 j_1})]^{1/2}}$.

Consequently, $|C| = (1 - \rho_1^2)$ and the joint distribution of $u_{ij}$ and $u_{i_1 j_1}$ is given by

(2.2.9)    $f(u_{ij}, u_{i_1 j_1}) = \frac{p-2}{2\pi(1-\rho_1^2)^{1/2}} \left[1 - \frac{u_{ij}^2 + u_{i_1 j_1}^2 - 2\rho_1 u_{ij} u_{i_1 j_1}}{1 - \rho_1^2}\right]^{(p-4)/2}$,

which is defined inside the ellipse

    $u_{ij}^2 + u_{i_1 j_1}^2 - 2\rho_1 u_{ij} u_{i_1 j_1} = 1 - \rho_1^2$.

The parameter $\rho_1$ will be referred to as the 'shape' parameter of the joint pdf of $u_{ij}$ and $u_{i_1 j_1}$, because it determines the shape and orientation of this ellipse.


It is useful to note that $\rho_1(u_{ij}, u_{i_1 j_1})$ is identically equal to the product moment correlation coefficient between

    $z_{ij} = z_i + z_j = e_i/a_{ii}^{1/2} + e_j/a_{jj}^{1/2}$

and

    $z_{i_1 j_1} = z_{i_1} + z_{j_1} = e_{i_1}/a_{i_1 i_1}^{1/2} + e_{j_1}/a_{j_1 j_1}^{1/2}$.

To this end, note that with $\sigma = 1$,

    $\mathrm{Var}(z_{ij}) = 2(1 + \rho_{ij})$,
    $\mathrm{Var}(z_{i_1 j_1}) = 2(1 + \rho_{i_1 j_1})$

and

    $\mathrm{Cov}(z_{ij}, z_{i_1 j_1}) = \rho_{i i_1} + \rho_{i j_1} + \rho_{j i_1} + \rho_{j j_1}$,

and the result follows immediately. This is useful in evaluating the shape parameter in special cases. It may also be noted that the joint pdf of $u_{ij}$ and $u_{i_1 j_1}$ is of the same form as that of $w_1$ and $w_2$ given at equation (2.2.1).

The bivariate distribution of two $v_{ij}$'s has exactly the same form as (2.2.9), but in this case the M matrix is taken as

    $M = \begin{pmatrix} 0 & \cdots & 1/[2(1-\rho_{ij})]^{1/2} & \cdots & -1/[2(1-\rho_{ij})]^{1/2} & \cdots & 0 \\ 0 & \cdots & 1/[2(1-\rho_{i_1 j_1})]^{1/2} & \cdots & -1/[2(1-\rho_{i_1 j_1})]^{1/2} & \cdots & 0 \end{pmatrix}$,

where the non-zero elements occur at the ith and jth positions in the first row and the $i_1$th and $j_1$th positions in the second row. Further the shape parameter $\rho_1'$ is given by

(2.2.10)    $\rho_1' = \rho_1'(v_{ij}, v_{i_1 j_1}) = \frac{\rho_{i i_1} - \rho_{i j_1} - \rho_{j i_1} + \rho_{j j_1}}{2[(1-\rho_{ij})(1-\rho_{i_1 j_1})]^{1/2}}$.

Again the shape parameter $\rho_1'$ is equal to the product moment correlation between $z_i - z_j$ and $z_{i_1} - z_{j_1}$.
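
Equations (2.2.8) and (2.2.10) translate directly into code. The following sketch assumes the correlation matrix R of the residuals is available as a list of rows with zero-based indices; the function names and the equicorrelated example are illustrative assumptions, not part of the thesis.

```python
import math

def shape_parameter_u(R, i, j, i1, j1):
    """Shape parameter (2.2.8) for the pair (u_ij, u_i1j1); indices are zero-based."""
    num = R[i][i1] + R[i][j1] + R[j][i1] + R[j][j1]
    den = 2.0 * math.sqrt((1.0 + R[i][j]) * (1.0 + R[i1][j1]))
    return num / den

def shape_parameter_v(R, i, j, i1, j1):
    """Shape parameter (2.2.10) for the pair (v_ij, v_i1j1); indices are zero-based."""
    num = R[i][i1] - R[i][j1] - R[j][i1] + R[j][j1]
    den = 2.0 * math.sqrt((1.0 - R[i][j]) * (1.0 - R[i1][j1]))
    return num / den

def equicorrelated(n, rho):
    """Correlation matrix with ones on the diagonal and rho elsewhere."""
    return [[1.0 if r == c else rho for c in range(n)] for r in range(n)]
```

For the equicorrelated case with $\rho = -1/4$ (the mean-only model with $n = 5$), two disjoint pairs give a u-shape parameter of $-2/3$, while two pairs sharing one index give $1/6$ for u and $1/2$ for v.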

For $s = 1$, Corollary 2.2.1 immediately gives the univariate density functions of $u_{ij}$ and $v_{ij}$ respectively, as

(2.2.11)    $f(u_{ij}) = (1 - u_{ij}^2)^{(p-3)/2}/B[1/2, (p-1)/2]$,  $-1 < u_{ij} < 1$,
            $f(v_{ij}) = (1 - v_{ij}^2)^{(p-3)/2}/B[1/2, (p-1)/2]$,  $-1 < v_{ij} < 1$.

Remark: Margolin (1977) has obtained Ellenberg's result for the joint density of $(w_1, w_2, \ldots, w_s)$. He has used the following theorem with its corollary for establishing this result.

Theorem. Consider a set of n random variables $Z^* = (Z_1^*, Z_2^*, \ldots, Z_n^*)$ whose distribution function has only one parameter $\theta > 0$. Assume further that

(a) $T^* = u(Z^*)$ is sufficient for $\theta$, and $T^*$ has a gamma distribution $G(d, \theta)$ for $d > 0$, that is

    $f_{T^*}(t^*) = \theta^d\, t^{*\,d-1} e^{-\theta t^*}/\Gamma(d)$,  $t^* > 0$;

(b) H is a function from $R^n$ to R such that $E_{Z^*}\{|H(Z^*)|\}$ exists and is finite for all $\theta > 0$; and

(c) $S^* = (S_1^*, S_2^*, \ldots, S_n^*)$ is a vector of studentized analogues of $Z^*$, namely for certain strictly positive functions $h_i(T^*)$ with $h_i(1) = 1$, assume that $S_i^* = Z_i^*/h_i(T^*)$ $(i = 1, \ldots, n)$.

Then it follows that

    $E\{H(S^*)\} = \Gamma(d)\, L^{-1}\{E_{Z^*}\{H(Z^*)\}\, \theta^{-d};\ 1\}$,

where $L^{-1}$ is the inverse of the Laplace transform, that is, if

    $L\{f(t^*); \theta\} = g(\theta) = \int_0^\infty e^{-\theta t^*} f(t^*)\, dt^*$,

then the inverse Laplace transform of $g(\theta)$ is $L^{-1}\{g(\theta); t^*\}$.

Corollary. Assume the conditions of the theorem. In addition to that, if the probability density functions $f_{S^*}$ and $f_{Z^*}$ exist, then

(2.2.12)    $f_{S^*}(s^*) = \Gamma(d)\, L^{-1}\{f_{Z^*}(s^*; \theta)\, \theta^{-d};\ 1\}$.

Using these, Margolin has obtained the marginal pdf of $w' = (w_1, \ldots, w_s)$ as

(2.2.13)    $f_w(w) = \frac{\Gamma\{(n-k+\nu)/2\}}{\Gamma\{(n-k+\nu-s)/2\}}\, \pi^{-s/2}\, |R|^{-1/2} (1 - w'R^{-1}w)^{(n-k+\nu-s-2)/2}$,

where R here denotes the correlation matrix of the s residuals involved, which is the same as derived by Ellenberg (1973).

Possibly, Margolin's results are valid under weaker conditions, since he implicitly assumes that the residual sum of squares $S^2$ is sufficient for $\sigma^2$, which is not true. Using the same argument and applying the results in our case, we get the joint pdf of u exactly the same as derived in equation (2.2.7).

2.3. Nominal percentile points

The result in the following lemma is used at several places in this thesis.

Lemma 2.3.1.

Let $z_1, z_2, \ldots, z_N$ be N identically distributed random variables with common pdf

    $p(z_i) = (1 - z_i^2)^{(p-3)/2}/B(1/2, (p-1)/2)$,  $-1 \le z_i \le 1$,

and $Z = \mathrm{Max}\{z_i : i = 1, 2, \ldots, N\}$. Then an upper limit $z_\alpha$ for the exact percentile point of Z at a level of significance $\alpha$, for $(\alpha/N) < 0.5$, is given by the equation

    $I_{1 - z_\alpha^2}[(p-1)/2, 1/2] = 2\alpha/N$,

where $I_x(a, b) = \int_0^x t^{a-1}(1-t)^{b-1}\, dt / B(a, b)$ is the incomplete beta function. The upper limit $z_\alpha$ will be called a nominal upper $100\alpha$ percent critical point of Z.


Proof: Clearly

    $\Pr(Z > z) = \Pr(\mathrm{Max}_i\, z_i > z) = \Pr\left(\bigcup_{i=1}^{N} (z_i > z)\right)$.

Applying the first Bonferroni inequality, we get

    $\Pr(Z > z) \le N \Pr(z_1 > z)$.

Let $z_\alpha$ be the solution of

    $N \Pr(z_1 > z) = \alpha$.

Then we immediately get

    $\Pr(Z > z_\alpha) \le \alpha$,

and $z_\alpha$ is an upper limit for the exact percentile point. Thus, for getting $z_\alpha$, we solve $\Pr(z_1 > z_\alpha) = \alpha/N$. But for $\alpha/N < 0.5$ we have $z_\alpha > 0$ and

    $\Pr(z_1 > z_\alpha) = \int_{z_\alpha}^{1} (1 - t^2)^{(p-3)/2}\, dt / B(1/2, (p-1)/2) = \frac{1}{2} I_{1 - z_\alpha^2}[(p-1)/2, 1/2]$.

Consequently, the solution of

    $I_{1 - z_\alpha^2}[(p-1)/2, 1/2] = 2\alpha/N$

gives the desired upper limit for the exact percentile point.
Corollary 2.3.1. If $Z' = \mathrm{Max}_{1 \le i \le N}\{|z_i|\}$, then an upper limit $z'_\alpha$ for the true $100\alpha$ percent point of $Z'$ is given by

    $I_{1 - z'^2_\alpha}[(p-1)/2, 1/2] = \alpha/N$.

Proof: As before, we have

    $\Pr(Z' > z'_\alpha) \le N \Pr(|z_1| > z'_\alpha) = 2N \Pr(z_1 > z'_\alpha)$

and the result follows. Note that $z'_\alpha > 0$ since $N > 1$, and we do not require that $z_\alpha > 0$.

Upper and lower limits for the true percentage points can be obtained by using the Bonferroni (David, 1956) and other inequalities.

Since $U = \mathrm{Max}_{1 \le i < j \le n}\, u_{ij}$, an upper bound $u_\alpha$ for the true percentile point at a level of significance $\alpha$ can be obtained by using Lemma 2.3.1. With $N = \binom{n}{2}$, we immediately get

(2.3.1)    $I_{1 - u_\alpha^2}[(p-1)/2, 1/2] = 4\alpha/[n(n-1)]$.

Solution of equation (2.3.1) gives the nominal upper percentage point $u_\alpha$. This can be obtained either from the tables of the incomplete beta function prepared by Pearson (1968) or by the method described in Appendix I.

Similarly for $v_\alpha$ the equation to be solved is

(2.3.2)    $I_{1 - v_\alpha^2}[(p-1)/2, 1/2] = 2\alpha/[n(n-1)]$.

Table 2.3.1 gives nominal upper critical values $u_\alpha$ for $\alpha$ = 0.005, 0.01, 0.025, 0.05 and 0.10; n = 5(1)12, 14(1)16(2)20, 21, 24, 25, 27, 28(2)32, 33, 35, 36, 40, 42, 45, 48(1)50(5)60(10)100 and for k = 1(1) min(n-2, 15). The quantity on the R.H.S. of (2.3.1), that is $4\alpha/[n(n-1)]$, is calculated first. Then using the inverse of the incomplete beta function, which is given in the procedure described in Appendix I, the values of $1 - u_\alpha^2$ and of $u_\alpha$ are determined. The values of n for which these calculations are done are chosen such that this table would give critical values for two-way tables with r rows and c columns, for $r + c \le 14$. Additional values of n are also included. The same table can be used for $v_\alpha$ also, for $\alpha$ = 0.01, 0.02, 0.05, 0.10 and 0.20.
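
The nominal critical values defined by (2.3.1) and (2.3.2) can also be obtained by direct root finding. The sketch below is not the Appendix I procedure; it is a minimal alternative that assumes $p = n - k + \nu$, evaluates the upper tail $\Pr(z > x) = \frac{1}{2} I_{1-x^2}[(p-1)/2, 1/2]$ by Simpson's rule after the substitution $z = \sin\theta$, and solves for the critical value by bisection. All function names are hypothetical.

```python
import math

def beta_fn(x, y):
    """Complete beta function B(x, y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def upper_tail(x, p, m=2000):
    """Pr(z > x) for the density (1-z^2)^((p-3)/2)/B(1/2,(p-1)/2), 0 <= x <= 1."""
    lo, hi = math.asin(x), math.pi / 2.0
    step = (hi - lo) / m
    total = 0.0
    for i in range(m + 1):
        weight = 1 if i in (0, m) else (4 if i % 2 else 2)
        # with z = sin(theta) the integrand becomes cos(theta)^(p-2)
        total += weight * max(0.0, math.cos(lo + i * step)) ** (p - 2)
    return total * step / 3.0 / beta_fn(0.5, (p - 1) / 2.0)

def solve_tail(target, p, iters=60):
    """Bisection for x in [0, 1] with upper_tail(x, p) = target (target < 0.5)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if upper_tail(mid, p) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def nominal_u(alpha, n, k, nu=0):
    """Nominal critical value of U from (2.3.1): tail = 2*alpha/[n(n-1)]."""
    return solve_tail(2.0 * alpha / (n * (n - 1)), n - k + nu)

def nominal_v(alpha, n, k, nu=0):
    """Nominal critical value of V from (2.3.2): tail = alpha/[n(n-1)]."""
    return solve_tail(alpha / (n * (n - 1)), n - k + nu)
```

With $\nu = 0$, `nominal_u(0.05, 20, 8)` and `nominal_v(0.05, 20, 8)` should be close to the tabulated values 0.8244 and 0.8465 quoted in the example of this section.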

For comparing the performance of tests based on U and V with the sequential test, a brief table of nominal upper critical values $u_\alpha$ for (n,k) = (10,1), (11,1), (20,8), (21,1) and (48,13), and $\alpha$ = 0.02625 is given in Table 2.3.2. For the same $\alpha$, and (n,k) = (20,1), (20,8), (48,13), the $v_\alpha$ values are given in Table 2.3.3.

As an example for the use of these critical values, we again consider Example 2.1.1. There we calculated the values of U, V and the third statistic as 0.6788, 0.8617 and 0.8022 respectively. From Table 2.3.1 we find that the $u_\alpha$ value for $\alpha$ = 0.05, n = 20, k = 8 is 0.8244 and the $v_\alpha$ value for the same n, k and $\alpha$ is 0.8465. Thus only V exceeds the tabulated value at the 5 percent level of significance and hence we decide that there are two outliers, one on either side.


In general, it is extremely difficult to formulate a model that there are two outliers on the right, or on the left, or one on each side. Consequently, we recommend that all three statistics should be examined for possible outliers. A decision to declare outliers can then be based on the observed significance probability (P-value). For the present example, the value of U is considerably small, and hence we calculate P-values for the third statistic and V only, which are equal to 0.092 and 0.029 respectively. This shows that there are two outliers, one on each side, and we accordingly declare y[...] and y[...] as the two outliers. It may be noted that using a two-sided statistic for the detection of a single outlier at a significance level $\alpha$ = 0.05, Doornbos (1981) came to the conclusion that y[...] is an outlying observation.

A lower limit for the true percentage point, denoted here by $\underline{u}_\alpha$, can be obtained by considering the second Bonferroni inequality and solving for $\underline{u}_\alpha$ in

(2.3.3)    $\binom{n}{2} \Pr(u_{ij} > \underline{u}_\alpha) - \sum\sum \Pr(u_{ij} > \underline{u}_\alpha,\ u_{i_1 j_1} > \underline{u}_\alpha) = \alpha$,

where the double sum is over all distinct terms which appear in the second term of the Bonferroni inequality. Note that equation (2.3.3) may not have a solution for all values of n and $\alpha$, especially for large values of these quantities. Solution of (2.3.3) involves the calculation of bivariate probabilities, which among other things depend on the shape parameter $\rho_1 = \rho_1(u_{ij}, u_{i_1 j_1})$. We discuss this evaluation in the next section.

2.4. Evaluation of bivariate probabilities


An expression for the bivariate probability $\Pr(w_1 > h, w_2 > k)$ has been obtained by Joshi (1972, 1975), where $(w_1, w_2)$ follow the joint distribution given at equation (2.2.1). Here we intend to find a recurrence relation involving p for this function. Since the joint pdf of $(u_{ij}, u_{i_1 j_1})$ given at equation (2.2.9) is analogous to that of $(w_1, w_2)$, hence we continue to use the simpler notation of $w_1$, $w_2$ etc., involving only one suffix. Let

(2.4.1)    $M(h, k, \rho, p) = \Pr(w_1 > h, w_2 > k) = \frac{p-2}{2\pi(1-\rho^2)^{1/2}} \iint \left(1 - \frac{w_1^2 - 2\rho w_1 w_2 + w_2^2}{1 - \rho^2}\right)^{(p-4)/2} dw_1\, dw_2$,

where the region of integration is given by

(2.4.2)    $w_1 > h$,  $w_2 > k$,  $w_1^2 - 2\rho w_1 w_2 + w_2^2 < 1 - \rho^2$.

Define

    $Q(a) = \Pr(w_1 > a) = \int_a^1 (1 - w^2)^{(p-3)/2}\, dw / B[1/2, (p-1)/2]$.

Then the M function has the following properties:

(I) $M(h, k, \rho, p) = M(k, h, \rho, p)$.

This follows because of the symmetry of $w_1$ and $w_2$ in equation (2.4.1).

(II) $M(-h, k, \rho, p) + M(h, k, -\rho, p) = Q(k)$.

This follows from figure 2.4.1. Q(k) represents the probability of a point in the region covered by the segment DB and the part of the ellipse DCB. $M(-h, k, \rho, p)$ is the probability of a point lying in the region CABC. Also by symmetry we have region DACD = region A'B'C'A'. Hence the result follows.

(III) $M(-h, -k, \rho, p) = 1 - Q(h) - Q(k) + M(h, k, \rho, p)$.

This follows from figure 2.4.2. We see that $M(-h, -k, \rho, p)$ is the probability of the region covered by C'A'B'BCC'. Suppose W is the whole region covered by the ellipse, then

    C'A'B'BCC' = W - E'D'C'E' - D'B'E'D' + E'D'A'E'
               = W - E'D'C'E' - D'B'E'D' + BACB,

since E'D'A'E' = BACB, by symmetry. Hence, on integrating over these regions, we get the required result. Mathematical proofs of these results can also be derived easily.

Fig. 2.4.2. Showing the regions for $M(h, k, \rho, p)$ and $M(-h, -k, \rho, p)$.

From these relations it is obvious that we have to consider only h, k > 0. Let A be the point (h, k) in the $(w_1, w_2)$-plane. Without loss of generality we take A to be inside the ellipse

(2.4.3)    $w_1^2 - 2\rho w_1 w_2 + w_2^2 = 1 - \rho^2$,

otherwise the required probability is zero. The region of integration is then the shaded area ABCA (see figure 2.4.3).

Theorem 2.4.1. Consider the figure 2.4.3. Let

    $M(h, k, \rho, p) = \Pr(w_1 > h, w_2 > k) = \Pr[(w_1, w_2) \in ABCA]$,

and

    $M_1(h, k, \rho, p) = \Pr[(w_1, w_2) \in ABDA]$,

where AD is the line segment passing through the origin and the point (h, k) and intersecting the ellipse at the point D. Then for all h, k > 0,

(i) $M(h, k, \rho, p) = M_1(h, k, \rho, p) + M_1(k, h, \rho, p)$,

(ii) $M_1(\rho k, k, \rho, p) = \frac{1}{4} I_{1-k^2}[(p-1)/2, 1/2]$,

(iii) $M_1(h, k, \rho, p) = M_1(\rho k, k, \rho, p) - \mathrm{sign}(h - \rho k)\, L(k, c, p)$,

where sign(u) = +1 or -1 according as u > 0 or u < 0 respectively,

(2.4.4)    $c = |h - \rho k|/(h^2 + k^2 - 2\rho hk)^{1/2}$

and

(2.4.5)    $L(k, c, p) = \frac{1}{2\pi} \int_0^c \frac{1}{(1 - z^2)^{1/2}} \left(1 - \frac{k^2}{1 - z^2}\right)^{(p-2)/2} dz$.

Proof: (i) This is an obvious result and is proved in Joshi (1975).

(ii) Joshi (1975) has also obtained expressions for $M(h, k, \rho, p)$ and $M_1(h, k, \rho, p)$ as

(2.4.6)    $M(h, k, \rho, p) = \frac{1}{2\pi} \int_{hk - \{(1-h^2)(1-k^2)\}^{1/2}}^{\rho} \frac{1}{(1 - z^2)^{1/2}} \left(1 - \frac{h^2 + k^2 - 2zhk}{1 - z^2}\right)^{(p-2)/2} dz$

and

(2.4.7)    $M_1(h, k, \rho, p) = \frac{1}{2\pi} \int_{\arctan[k/(1-k^2)^{1/2}]}^{\arctan[k(1-\rho^2)^{1/2}/(h - \rho k)]} (1 - k^2 \mathrm{cosec}^2\theta)^{(p-2)/2}\, d\theta$.

Substituting $z = -\cos\theta$ in (2.4.7), we get

(2.4.8)    $M_1(h, k, \rho, p) = \frac{1}{2\pi} \int_{-(1-k^2)^{1/2}}^{-(h - \rho k)/(h^2 + k^2 - 2\rho hk)^{1/2}} \frac{1}{(1 - z^2)^{1/2}} \left(1 - \frac{k^2}{1 - z^2}\right)^{(p-2)/2} dz$.

Similarly, on interchanging the roles of h and k, we obtain $M_1(k, h, \rho, p)$.

When $h - \rho k = 0$, that is, $h = \rho k$, equation (2.4.8) gives

    $M_1(h, k, \rho, p) = M_1(\rho k, k, \rho, p) = \frac{1}{2\pi} \int_{-(1-k^2)^{1/2}}^{0} \frac{1}{(1 - z^2)^{1/2}} \left(1 - \frac{k^2}{1 - z^2}\right)^{(p-2)/2} dz$
    $\quad = M(0, k, 0, p)$, from equation (2.4.6),
    $\quad = \Pr(w_1 > 0, w_2 > k \mid \rho = 0, p)$
    $\quad = \frac{p-2}{2\pi} \int_k^1 \int_0^{(1 - w_2^2)^{1/2}} (1 - w_1^2 - w_2^2)^{(p-4)/2}\, dw_1\, dw_2$,

where the last equality follows by considering the joint pdf of $w_1$ and $w_2$ for $\rho = 0$ and integrating over the region $w_1 > 0$, $w_2 > k$. Letting $t = w_1/(1 - w_2^2)^{1/2}$, we get

    $M_1(\rho k, k, \rho, p) = \frac{1}{2} \int_k^1 (1 - w_2^2)^{(p-3)/2}\, dw_2 / B[1/2, (p-1)/2]$
    $\quad = \frac{1}{4} I_{1-k^2}[(p-1)/2, 1/2]$.

(iii) When $h - \rho k < 0$, the upper limit of the integral appearing on the R.H.S. of (2.4.8) is c, with c given at equation (2.4.4). Hence c > 0 and

    $M_1(h, k, \rho, p) = \frac{1}{2\pi} \int_{-(1-k^2)^{1/2}}^{c} \frac{1}{(1 - z^2)^{1/2}} \left(1 - \frac{k^2}{1 - z^2}\right)^{(p-2)/2} dz$
    $\quad = \frac{1}{2\pi} \int_{-(1-k^2)^{1/2}}^{0} \frac{1}{(1 - z^2)^{1/2}} \left(1 - \frac{k^2}{1 - z^2}\right)^{(p-2)/2} dz + \frac{1}{2\pi} \int_0^c \frac{1}{(1 - z^2)^{1/2}} \left(1 - \frac{k^2}{1 - z^2}\right)^{(p-2)/2} dz$.

From the result proved at (ii) above, the first term on the R.H.S. is equal to $M_1(\rho k, k, \rho, p)$. Consequently,

    $M_1(h, k, \rho, p) = M_1(\rho k, k, \rho, p) + L(k, c, p)$,

where $L(k, c, p)$ is given at (2.4.5).

Similarly, when $h - \rho k > 0$, we get

    $M_1(h, k, \rho, p) = M_1(\rho k, k, \rho, p) - L(k, c, p)$.

This completes the proof of the theorem.
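
Theorem 2.4.1 can be turned into a small routine for $M(h, k, \rho, p)$. The sketch below is illustrative only: it evaluates $L(k, c, p)$ from (2.4.5) by Simpson's rule rather than by any closed form, uses $\frac{1}{4} I_{1-k^2}[(p-1)/2, 1/2] = \frac{1}{2} Q(k)$, and handles only points (h, k) with h, k > 0 strictly inside the ellipse (2.4.3), as the theorem assumes. The function names are hypothetical.

```python
import math

def simpson(f, a, b, m=2000):
    """Composite Simpson rule with an even number m of subintervals."""
    step = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0

def upper_tail(x, p):
    """Q(x) = Pr(w > x), computed with the substitution w = sin(theta)."""
    beta = math.gamma(0.5) * math.gamma((p - 1) / 2.0) / math.gamma(p / 2.0)
    f = lambda t: max(0.0, math.cos(t)) ** (p - 2)
    return simpson(f, math.asin(x), math.pi / 2.0) / beta

def L_quad(k, c, p):
    """L(k, c, p) of (2.4.5) by numerical quadrature."""
    f = lambda z: (max(0.0, 1.0 - k * k / (1.0 - z * z)) ** ((p - 2) / 2.0)
                   / math.sqrt(1.0 - z * z))
    return simpson(f, 0.0, c) / (2.0 * math.pi)

def M1(h, k, rho, p):
    """M_1(h, k, rho, p) from parts (ii) and (iii) of Theorem 2.4.1."""
    base = 0.5 * upper_tail(k, p)      # (1/4) I_{1-k^2}[(p-1)/2, 1/2]
    d = h - rho * k
    if d == 0.0:
        return base
    c = abs(d) / math.sqrt(h * h + k * k - 2.0 * rho * h * k)
    return base - math.copysign(1.0, d) * L_quad(k, c, p)

def M(h, k, rho, p):
    """M(h, k, rho, p) from part (i); (h, k) must lie inside the ellipse (2.4.3)."""
    if h * h + k * k - 2.0 * rho * h * k >= 1.0 - rho * rho:
        raise ValueError("(h, k) must be strictly inside the ellipse")
    return M1(h, k, rho, p) + M1(k, h, rho, p)
```

For $p = 2$ the result can be compared with the closed form obtained from (2.4.6), namely $[\sin^{-1}\rho - \sin^{-1}(hk - \{(1-h^2)(1-k^2)\}^{1/2})]/2\pi$.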

This theorem shows that in order to evaluate $M_1(h, k, \rho, p)$ and hence $M(h, k, \rho, p)$, we need to evaluate $L(k, c, p)$, since the incomplete beta functions appearing in these expressions are extensively tabulated or can be obtained by the method described in Appendix I. For the evaluation of $L(k, c, p)$, we derive a recursive relation in Theorem 2.4.2.

Theorem 2.4.2. Let k and c be positive and

    $L(k, c, p) = \frac{1}{2\pi} \int_0^c \frac{1}{(1 - z^2)^{1/2}} \left(1 - \frac{k^2}{1 - z^2}\right)^{(p-2)/2} dz$,

then

    $L(k, c, 2) = \frac{1}{2\pi} \sin^{-1}(c)$,

    $L(k, c, 3) = \frac{1}{4\pi} \left[\sin^{-1}\left(\frac{k^2 + 2c^2 - 1}{1 - k^2}\right) - k \sin^{-1}\left\{\frac{2k^2 - (1 + k^2)(1 - c^2)}{(1 - k^2)(1 - c^2)}\right\}\right] + \frac{1 - k}{8}$,

and for p > 3,

    $L(k, c, p) = L(k, c, p-2) - \frac{k(1 - k^2)^{(p-3)/2}}{4\pi}\, B[1/2, (p-2)/2]\ I_{c^2 k^2/[(1 - c^2)(1 - k^2)]}[1/2, (p-2)/2]$.


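Proof: For p = 2, we have

    $L(k, c, 2) = \frac{1}{2\pi} \int_0^c \frac{dz}{(1 - z^2)^{1/2}} = \frac{1}{2\pi} \sin^{-1}(c)$.

Next consider the case p = 3, that is

    $L(k, c, 3) = \frac{1}{2\pi} \int_0^c \frac{1}{(1 - z^2)^{1/2}} \left(1 - \frac{k^2}{1 - z^2}\right)^{1/2} dz$.

Putting $z_1 = k^2/(1 - z^2)$, we get $z = (1 - k^2/z_1)^{1/2}$ and $dz = \frac{k^2}{2} z_1^{-3/2} (z_1 - k^2)^{-1/2}\, dz_1$, so that

    $L(k, c, 3) = \frac{k}{4\pi} \int_{k^2}^{k^2/(1-c^2)} \frac{(1 - z_1)^{1/2}}{z_1 (z_1 - k^2)^{1/2}}\, dz_1$
    $\quad = \frac{k}{4\pi} \int_{k^2}^{k^2/(1-c^2)} \frac{(1 - z_1)\, dz_1}{z_1 \{-k^2 + (1 + k^2) z_1 - z_1^2\}^{1/2}}$
    $\quad = \frac{k}{4\pi} \left[\int_{k^2}^{k^2/(1-c^2)} \frac{dz_1}{z_1 Z^{1/2}} - \int_{k^2}^{k^2/(1-c^2)} \frac{dz_1}{Z^{1/2}}\right]$,

where $Z = a + b z_1 + d z_1^2$ with $a = -k^2$, $b = 1 + k^2$ and $d = -1$. Using the tables of standard integrals (for example, see Selby and Girling (1965), p. 307), we get

    $L(k, c, 3) = \frac{k}{4\pi} \left[\frac{1}{(-a)^{1/2}} \sin^{-1}\left\{\frac{b z_1 + 2a}{z_1 (b^2 - 4ad)^{1/2}}\right\} - \frac{1}{(-d)^{1/2}} \sin^{-1}\left\{\frac{-2d z_1 - b}{(b^2 - 4ad)^{1/2}}\right\}\right]_{k^2}^{k^2/(1-c^2)}$.

That is,

    $L(k, c, 3) = \frac{1}{4\pi} \left[\sin^{-1}\left(\frac{k^2 + 2c^2 - 1}{1 - k^2}\right) - k \sin^{-1}\left\{\frac{2k^2 - (1 + k^2)(1 - c^2)}{(1 - k^2)(1 - c^2)}\right\}\right] + \frac{1 - k}{8}$.

Now for p > 3,

    $L(k, c, p) = \frac{1}{2\pi} \int_0^c \frac{1}{(1 - z^2)^{1/2}} \left(1 - \frac{k^2}{1 - z^2}\right)^{(p-4)/2} \left(1 - \frac{k^2}{1 - z^2}\right) dz$
    $\quad = \frac{1}{2\pi} \int_0^c \frac{1}{(1 - z^2)^{1/2}} \left(1 - \frac{k^2}{1 - z^2}\right)^{(p-4)/2} dz - \frac{k^2}{2\pi} \int_0^c \frac{1}{(1 - z^2)^{3/2}} \left(1 - \frac{k^2}{1 - z^2}\right)^{(p-4)/2} dz$
    $\quad = L(k, c, p-2) - I$,

where the integral I is given by

(2.4.9)    $I = \frac{k^2}{2\pi} \int_0^c \frac{1}{(1 - z^2)^{3/2}} \left(1 - \frac{k^2}{1 - z^2}\right)^{(p-4)/2} dz$.

Putting $z_1 = k^2/(1 - z^2)$, we get

    $I = \frac{k}{4\pi} \int_{k^2}^{k^2/(1-c^2)} (1 - z_1)^{(p-4)/2} (z_1 - k^2)^{-1/2}\, dz_1$.

Letting $z_1 - k^2 = z_2 (1 - k^2)$, we have $dz_1 = (1 - k^2)\, dz_2$, $1 - z_1 = (1 - k^2)(1 - z_2)$, and

    $I = \frac{k (1 - k^2)^{(p-3)/2}}{4\pi} \int_0^{c^2 k^2/[(1 - c^2)(1 - k^2)]} z_2^{-1/2} (1 - z_2)^{(p-4)/2}\, dz_2$
    $\quad = \frac{k (1 - k^2)^{(p-3)/2}}{4\pi}\, B[1/2, (p-2)/2]\ I_{c^2 k^2/[(1 - c^2)(1 - k^2)]}[1/2, (p-2)/2]$.

This completes the proof of the theorem.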


Note that, for $a \ge 0$,

    $I_{a^2}(1/2, 1/2) = \frac{2}{\pi} \sin^{-1}(a)$.

Consequently,

    $\sin^{-1}(a) = \frac{\pi}{2} I_{a^2}(1/2, 1/2)$ if $a > 0$,
    $\sin^{-1}(a) = -\frac{\pi}{2} I_{a^2}(1/2, 1/2)$ if $a < 0$,

and it is possible to evaluate the $L(k, c, p)$ function of Theorem 2.4.2 in terms of the incomplete beta function for all values of p.

The bivariate probability can also be evaluated by numerical integration from equation (2.4.6). But for smaller values of p this recursive method, which essentially involves the evaluation of incomplete beta integrals, is more accurate.
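
The recursion of Theorem 2.4.2 can be compared against direct quadrature of (2.4.5). The sketch below assumes integer $p \ge 2$ and $c^2 + k^2 < 1$; it evaluates the unregularized incomplete beta integral through the substitution $t = s^2$ rather than from tables. Function names are hypothetical.

```python
import math

def simpson(f, a, b, m=2000):
    """Composite Simpson rule with an even number m of subintervals."""
    step = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0

def L_direct(k, c, p):
    """L(k, c, p) by numerical integration of (2.4.5)."""
    f = lambda z: (max(0.0, 1.0 - k * k / (1.0 - z * z)) ** ((p - 2) / 2.0)
                   / math.sqrt(1.0 - z * z))
    return simpson(f, 0.0, c) / (2.0 * math.pi)

def L_recursive(k, c, p):
    """L(k, c, p) by the recursion of Theorem 2.4.2 (integer p >= 2)."""
    if p == 2:
        return math.asin(c) / (2.0 * math.pi)
    if p == 3:
        s1 = math.asin((k * k + 2.0 * c * c - 1.0) / (1.0 - k * k))
        s2 = math.asin((2.0 * k * k - (1.0 + k * k) * (1.0 - c * c))
                       / ((1.0 - k * k) * (1.0 - c * c)))
        return (s1 - k * s2) / (4.0 * math.pi) + (1.0 - k) / 8.0
    x = c * c * k * k / ((1.0 - c * c) * (1.0 - k * k))
    # B(1/2,(p-2)/2) * I_x(1/2,(p-2)/2) = int_0^x t^(-1/2) (1-t)^((p-4)/2) dt
    #                                   = 2 * int_0^sqrt(x) (1-s^2)^((p-4)/2) ds
    inc_beta = 2.0 * simpson(lambda s: max(0.0, 1.0 - s * s) ** ((p - 4) / 2.0),
                             0.0, math.sqrt(x))
    return (L_recursive(k, c, p - 2)
            - k * (1.0 - k * k) ** ((p - 3) / 2.0) * inc_beta / (4.0 * math.pi))
```

For instance, with k = 0.3 and c = 0.5 the two routines should agree closely for every p from 2 upwards.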

Results for the case h = k > 0.

We next discuss the important case h = k > 0. This is useful in our application as we need bivariate probabilities like $\Pr(u_{ij} > h, u_{i_1 j_1} > h) = M(h, h, \rho_1, p)$, where $\rho_1$ is the shape parameter given at equation (2.2.8). Similarly

(2.4.10)    $\Pr(|v_{ij}| > h, |v_{i_1 j_1}| > h) = 2[M(h, h, \rho_1', p) + M(h, h, -\rho_1', p)]$,

where $\rho_1'$ is given at equation (2.2.10). This is required for evaluating the lower bound. Since the joint pdf of $(u_{ij}, u_{i_1 j_1})$ is identical to that of $(w_1, w_2)$ given at equation (2.2.1), hence using results due to Srikantan (1961), Joshi (1972) etc. we have

(2.4.11)    $\Pr(w_1 > h, w_2 > h \mid \rho, p) = M(h, h, \rho, p) = 0$

whenever $h > \{(1 + \rho)/2\}^{1/2}$, that is, whenever

(2.4.12)    $\rho < 2h^2 - 1$.

Similarly,

(2.4.13)    $\Pr(|w_1| > h, |w_2| > h) = 0$ whenever $h > \{(1 + |\rho|)/2\}^{1/2}$.

Equation (2.4.11) shows that $M(h, h, \rho, p) = 0$ for all $h \ge \{(1 + \rho)/2\}^{1/2}$. For other non-negative values of h, the following theorem gives a systematic evaluation of $M(h, h, \rho, p)$ in terms of incomplete beta functions.

Theorem 2.4.3. For $0 \le h < \{(1 + \rho)/2\}^{1/2}$, $M(h, h, \rho, p)$ is given by the following relations:

(i) $M(h, h, \rho, 2) = [\sin^{-1}\rho - \sin^{-1}(2h^2 - 1)]/2\pi$,

(ii) $M(h, h, \rho, 3) = \frac{1 - h}{4} - \frac{1}{2\pi} \left[\sin^{-1}\left(\frac{h^2 - \rho}{1 - h^2}\right) - h \sin^{-1}\left\{\frac{4h^2 - (1 + h^2)(1 + \rho)}{(1 + \rho)(1 - h^2)}\right\}\right]$,

(iii) For p > 3,

    $M(h, h, \rho, p) = M(h, h, \rho, p-2) - \frac{h (1 - h^2)^{(p-3)/2}}{2\pi}\, B[(p-2)/2, 1/2]\ I_{(1 + \rho - 2h^2)/[(1 + \rho)(1 - h^2)]}[(p-2)/2, 1/2]$.

Proof: From equation (2.4.6), we have

(2.4.14)    $M(h, h, \rho, p) = \frac{1}{2\pi} \int_{2h^2 - 1}^{\rho} \frac{1}{(1 - z^2)^{1/2}} \left(1 - \frac{2h^2}{1 + z}\right)^{(p-2)/2} dz$.

For p = 2, it immediately gives

    $M(h, h, \rho, 2) = \frac{1}{2\pi} \int_{2h^2 - 1}^{\rho} \frac{dz}{(1 - z^2)^{1/2}} = \frac{1}{2\pi} \left[\sin^{-1}(z)\right]_{2h^2 - 1}^{\rho} = \frac{1}{2\pi} [\sin^{-1}(\rho) - \sin^{-1}(2h^2 - 1)]$.

For $p \ge 3$, substituting $z_1 = 2h^2/(1 + z)$ in equation (2.4.14) and simplifying, we have

(2.4.15)    $M(h, h, \rho, p) = \frac{h}{2\pi} \int_{2h^2/(1 + \rho)}^{1} \frac{(1 - z_1)^{(p-2)/2}}{z_1 (z_1 - h^2)^{1/2}}\, dz_1$.

For p = 3, equation (2.4.15) gives

    $M(h, h, \rho, 3) = \frac{h}{2\pi} \int_{2h^2/(1 + \rho)}^{1} \frac{(1 - z_1)^{1/2}}{z_1 (z_1 - h^2)^{1/2}}\, dz_1$.

Multiplying the numerator and denominator of the integrand by $(1 - z_1)^{1/2}$ and proceeding as for $L(k, c, 3)$ in Theorem 2.4.2, the result for p = 3 follows. Next for p > 3, writing $(1 - z_1)^{(p-2)/2}$ as $(1 - z_1)^{(p-4)/2}(1 - z_1)$ in equation (2.4.15), we get

(2.4.16)    $M(h, h, \rho, p) = M(h, h, \rho, p-2) - I$,

where

    $I = \frac{h}{2\pi} \int_{2h^2/(1 + \rho)}^{1} \frac{(1 - z_1)^{(p-4)/2}}{(z_1 - h^2)^{1/2}}\, dz_1$.

Substituting $z_1 - h^2 = z_2 (1 - h^2)$, we get

    $I = \frac{h (1 - h^2)^{(p-3)/2}}{2\pi} \int_{h^2 (1 - \rho)/[(1 - h^2)(1 + \rho)]}^{1} z_2^{-1/2} (1 - z_2)^{(p-4)/2}\, dz_2$
    $\quad = \frac{h (1 - h^2)^{(p-3)/2}}{2\pi}\, B[(p-2)/2, 1/2]\ I_{(1 + \rho - 2h^2)/[(1 - h^2)(1 + \rho)]}[(p-2)/2, 1/2]$.

Substituting in equation (2.4.16), the result follows.

2.5. Bounds for bivariate probabilities

For $\rho < 0$, and $c_1$ and $c_2$ of the same sign, Joshi (1972) has shown that

(2.5.1)    $\Pr(w_1 < c_1, w_2 < c_2) \le \prod_{i=1}^{2} \Pr(w_i < c_i)$.

Since the $u_{ij}$'s and $v_{ij}$'s also have exactly the same distribution as the $w_i$'s, hence (2.5.1) will hold for them. When $c_1 = c_2 = c$ is positive, (2.5.1) is equivalent to

    $\Pr(w_1 > c, w_2 > c) \le \prod_{i=1}^{2} \Pr(w_i > c) = [\Pr(w_1 > c)]^2$.

Since

    $\Pr(w_i > c) = \frac{1}{2} I_{1 - c^2}\left(\frac{p-1}{2}, \frac{1}{2}\right)$,

hence an upper bound of the bivariate probability when $\rho < 0$ is given by

(2.5.2)    $\Pr(w_1 > c, w_2 > c) \le \left[\frac{1}{2} I_{1 - c^2}\left(\frac{p-1}{2}, \frac{1}{2}\right)\right]^2$.

For all values of $\rho$, we can use the following inequality for obtaining a simple bound for the bivariate probability. The argument is essentially the same as that of Cook and Prescott (1981). We have

    $\Pr(w_1 > c, w_2 > c) \le \Pr(w_1 + w_2 > 2c)$
    $\quad = \Pr\left[\frac{w_1 + w_2}{\{2(1 + \rho)\}^{1/2}} > \frac{2c}{\{2(1 + \rho)\}^{1/2}}\right]$
    $\quad = \Pr[u_{12} > 2c/\{2(1 + \rho)\}^{1/2}]$,

where the last equality follows from the distribution of $u_{12}$ derived at equation (2.2.3). Hence

(2.5.3)    $\Pr(w_1 > c, w_2 > c) \le \frac{1}{2} I_{1 - 2c^2/(1 + \rho)}\left(\frac{p-1}{2}, \frac{1}{2}\right)$.

Table 2.5.1 gives the values of the bound given in equation (2.5.2), viz. $[\Pr(w_1 > c)]^2$, which does not depend on $\rho$ as long as $\rho < 0$. Tables 2.5.2, 2.5.3 and 2.5.4 give the exact values of $M(c, c, \rho, p)$ for c = 0.10(0.10)0.80, p = 2(1)10 and $\rho$ = -0.5, 0 and 0.5 respectively. Corresponding values of the bound given in equation (2.5.3) are also shown in these tables. From these tables we observe that the bound given in equation (2.5.2) is better for small values of c compared to the other one, when $\rho < 0$. But when c is close to the value $[(1 + \rho)/2]^{1/2}$, then the other bound is better. Also for $\rho > 0$, the only bound for the bivariate probability is given by (2.5.3). Thus both bounds are useful depending upon the values of $\rho$, c and p. But for practical purposes, it is better to calculate the exact bivariate probabilities by using the relations given in Theorem 2.4.3 or by using numerical integration.
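
The comparison between the exact probability and the two bounds can be sketched in code. The routine below implements the relations of Theorem 2.4.3 for integer $p \ge 2$, together with the bounds (2.5.2) and (2.5.3). It is an illustrative sketch: the incomplete beta integral is evaluated by Simpson's rule after the substitution $t = 1 - s^2$, and all function names are hypothetical.

```python
import math

def simpson(f, a, b, m=2000):
    """Composite Simpson rule with an even number m of subintervals."""
    step = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0

def upper_tail(x, p):
    """Pr(w > x) = (1/2) I_{1-x^2}[(p-1)/2, 1/2] for 0 <= x <= 1."""
    beta = math.gamma(0.5) * math.gamma((p - 1) / 2.0) / math.gamma(p / 2.0)
    f = lambda t: max(0.0, math.cos(t)) ** (p - 2)
    return simpson(f, math.asin(x), math.pi / 2.0) / beta

def M_equal(h, rho, p):
    """M(h, h, rho, p) by Theorem 2.4.3 (integer p >= 2, h >= 0)."""
    if h >= math.sqrt((1.0 + rho) / 2.0):
        return 0.0                                   # equation (2.4.11)
    if p == 2:
        return (math.asin(rho) - math.asin(2.0 * h * h - 1.0)) / (2.0 * math.pi)
    if p == 3:
        s1 = math.asin((h * h - rho) / (1.0 - h * h))
        s2 = math.asin((4.0 * h * h - (1.0 + h * h) * (1.0 + rho))
                       / ((1.0 + rho) * (1.0 - h * h)))
        return (1.0 - h) / 4.0 - (s1 - h * s2) / (2.0 * math.pi)
    lower = h * h * (1.0 - rho) / ((1.0 + rho) * (1.0 - h * h))   # 1 - x
    # B((p-2)/2,1/2) * I_x((p-2)/2,1/2) = 2 * int_{sqrt(1-x)}^{1} (1-s^2)^((p-4)/2) ds
    inc_beta = 2.0 * simpson(lambda s: max(0.0, 1.0 - s * s) ** ((p - 4) / 2.0),
                             math.sqrt(lower), 1.0)
    return (M_equal(h, rho, p - 2)
            - h * (1.0 - h * h) ** ((p - 3) / 2.0) * inc_beta / (2.0 * math.pi))

def bound_product(c, p):
    """Bound (2.5.2), valid for rho < 0."""
    return upper_tail(c, p) ** 2

def bound_sum(c, rho, p):
    """Bound (2.5.3), valid for all rho."""
    x = 2.0 * c / math.sqrt(2.0 * (1.0 + rho))
    return 0.0 if x >= 1.0 else upper_tail(x, p)
```

For $\rho = -0.5$ and $p = 3$, for example, the bound (2.5.2) at c = 0.2 is 0.16 while (2.5.3) gives 0.3, in line with the remark that (2.5.2) is the sharper bound for small c.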







TABLE 2.3.1. Nominal upper critical values $u_\alpha$ of one-sided
test statistic U for two outliers in linear regression.

($\alpha$ = 0.005)

  n | k      1       2       3       4       5       6       7

    5     .9911   .9990  1.0000
    6     .9783   .9932   .9993  1.0000
    7     .9636   .9321   .9946   .9995  1.0000
    8     .9470   .9676   .9845   .9955   .9996  1.0000
    9     .9301   .9513   .9707   .9864   .9962   .9997  1.0000
   10     .9133   .9345   .9549   .9732   .9878   .9968   .9998
   11     .8970   .9177   .9332   .9578   .9753   .9890   .9972
   12     .8813   .9012   .9214   .9414   .9603   .9770   .9899
   14     .3519   .8701   .8890   .9083   .9276   .9466   .9644
   15     .8382   .8556   .8736   .3922   .9112   .9302   .9483
   16     .8251   .8417   .8589   .8768   .8952   .9139   .9326
   18     .8003   .8158   .8315   .8479   .8648   .8824   .9003
   20     .7736   .7923   .8066   .8215   .8370   .8532   .8699
   21     .7682   .7313   .7949   .8092   .8240   .8395   .8555
   24     .7396   .7511   .7631   .7755   .7885   .8020   .8161
   25     .7309   .7419   .7333   .7653   .7777   .7907   .8042
   27     .7143   .7245   .7351   .7460   .7575   .7694   .7817
   28     .7065   .7163   .7264   .7370   .7480   .7594   .7713
   30     .6917   .7008   .7102   .7200   .7301   .7406   .7516
   32     .6779   .6864   .6951   .7042   .7136   .7233   .7335
   33     .6714   .6795   .6880   .6967   .7058   .7152   .7249
   35     .6589   .6665   .6744   .6826   .6910   .6998   .7083
   36     .6529   .6603   .6680   .6758   .6840   .6925   .7012
   40     .6308   .6373   .6441   .6510   .6582   .6656   .6733
   42     .6206   .6268   .6332   .6397   .6465   .6534   .6606
   45     .6064   .6121   .6179   .6240   .6301   .6365   .6431
   48     .5932   .5985   .6039   .6094   .6151   .6210   .6270
   49     .5890   .5942   .5994   .6048   .6104   .6161   .6220
   50     .5849   .5899   .5951   .6004   .6058   .6113   .6170
   55     .5658   .5703   .5749   .5795   .5843   .5892   .5943
   60     .5487   .5527   .5568   .5610   .5653   .5697   .5741
   70     .5191   .5224   .5258   .5292   .5327   .5363   .6399
   80     .4943   .4971   .4999   .5028   .5057   .5087   .5117
   90     .4731   .4755   .4779   .4803   .4828   .4854   .4880
  100     .4546   .4567   .4583   .4610   .4631   .4653   .4676
TABLE 2.3.1. Contd.

($\alpha$ = 0.005)


n | k   8   9   10   11   12   13   14


10 

1 .0000 

11 

.9998 

1? 

.9975 

14 

.9798 

15 

.9661 

16 

.9507 

IS 

.9185 

20 

.8871 

21 

.8721 

24 

.8 308 

25 

.8182 

27 

.7946 

28 

.7836 

30 

.7630 

32 

.7440 

33 

.7351 

35 

.718? 

36 

.7103 

40 

.6811 

42 

.6680 

45 

.6498 

48 

.6332 

49 

.6280 

50 

.6229 

55 

.5994 

60 

.5787 

70 

.5436 

80 

.5148 

90 

.4906 

100 

.4698 


1,0000 


.9998 

1,0000 

.9914 

.9980 

.9809 

.9920 

.9676 

.9819 

.9 366 

.9541 

.9047 

.9225 

.8892 

.9067 

.8460 

.3618 

.8328 

.8430 

.3081 

.8220 

.7965 

.8099 

.7748 

.7871 

.7549 

.7663 

.7456 

.7565 

.7279 

.7381 

.7196 

.7294 

.689 3 

.6978 

.6757 

.6836 

,6563 

.6640 

.639 6 

,6462 

.6342 

.6406 

.6289 

.6351 

.6047 

.6102 

.5834 

.5882 

.5474 

•5513 

.5180 

.5212 

.49 33 

.49 60 

.4721 

.4745 


.9999 

1.0000 

.9982 

.9999 

,9925 

.9983 

.9701 

.98 36 

.9400 

.95 69 

,9242 

.9415 

,8781 

.8948 

.8637 

.8799 

.8365 

.8516 

.8238 

.8382 

,7999 

.8133 

,7781 

.7903 

.7678 

.7796 

.7485 

.7594 

,7395 

.7500 

.7065 

.7155 

.6917 

.7002 

6714 

.6790 

.6529 

.6599 

6471 

.65 39 

6415 

.6481 

6157 

.6215 

5931 

.5982 

5552 

.5593 

5244 

.5278 

4987 

.5015 

4768 

.4793 


1.0000 


.9999 

1.0000 

.99 34 

,9986 

.9722 

.9850 

.9581 

.9731 

.9118 

.9288 

,8965 

.9133 

.8671 

.8831 

.85 32 

.8687 

.8271 

.8415 

.8031 

.8164 

.7919 

.8046 

.7708 

.7825 

.7608 

.7721 

.7249 

.7346 

,7089 

.7180 

,6869 

.6951 

.6671 

.6745 

.6609 

• 6681 

.6549 

.6618 

.6273 

.6334 

.6034 

.6087 

.5634 

.5676 

.5311 

.5346 

.5044 

.5073 

.4817 

.4842 


.9999 

.9941 

.9356 

.9455 

.9 301 
.8995 
.8846 
.3563 
.8301 

.8178 

.7947 

.7339 

.7447 

.7273 

.7035 

-6822 

.6755 

.6690 

.6396 

.6141 
.5719 
.5 381 
.5102 
.4867 
TABLE 2.3.1. Contd.


($\alpha$ = 0.01)


n | k   1   2   3   4   5   6   7

5 

.9859 

.9980 

1.0000 





6 

.9700 

.9893 

.9987 

1.0000 




7 

.9518 

.9747 

.9914 

,9990 

1.0000 



8 

.9 330 

.9571 

.9781 

.9929 

.9993 

1 .0000 


9 

.9144 

.9385 

.9613 

,9807 

.9940 

.9994 

1.0000 

10 

.8964 

,9198 

.9430 

.9646 

.9827 

.9948 

.9996 

11 

,8791 

.9016 

.9244 

.9467 

.9673 

.9844 

.9955 

12 

.8628 

.8841 

.9061 

.9283 

.9499 

.9697 

.9858 

14 

.8325 

-8517 

.8717 

.8924 

.9136 

.9347 

.9550 

15 

.8186 

.8367 

.8557 

.8755 

.8959 

.9167 

.9 374 

16 

.8054 

.8226 

.8406 

.8594 

.8789 

.8991 

.9195 

18 

.7809 

.7964 

.8126 

.8295 

.8473 

.8658 

.8849 

20 

.7588 

.7727 

.7873 

.8027 

.8187 

.8355 

.85 30 

21 

.7484 

,7617 

.7757 

.7902 

.8055 

.8215 

.8382 

24 

.7202 

.7317 

.7438 

.75 65 

.7697 

.7835 

.7979 

25 

.7115 

.7226 

.7342 

.7462 

.7589 

.7720 

.7858 

27 

.6952 

.7054 

.7160 

.7271 

.7386 

.7507 

.7633 

28 

.6875 

.6973 

.7075 

.7181 

.7292 

.7407 

.7528 

30 

,6730 

.6821 

.6915 

.70 i 3 

.7115 

.7221 

.7331 

32 

.6595 

.6679 

.6767 

,6857 

,6952 

.7049 

.7151 

33 

.65 31 

.6612 

.6697 

.6784 

.6875 

.6969 

.7067 

35 

.6409 

.6485 

.6563 

• 6645 

.6729 

.6816 

.6907 

36 

.6351 

.6424 

.6500 

.6579 

,6660 

.6744 

.6832 

40 

.6135 

.6200 

• 6266 

.6336 

.6407 

,6481 

.6557 

42 

• 6036 

.6097 

.6160 

.6225 

.6292 

.6361 

.6433 

45 

.5897 

.5953 

,6011 

.6071 

.6132 

.6195 

.6261 

48 

.5769 

.5821 

.5874 

.5929 

.5986 

,6044 

,6103 

49 

.5728 

.5779 

.5831 

.5884 

.59 39 

.5996 

.6054 

50 

.5688 

.5738 

.5789 

.5841 

.5894 

.5949 

.6006 

55 

.5503 

.5547 

.5592 

.5638 

.5685 

.5734 

.5784 

60 

.5337 

.5376 

.5416 

.5458 

.5500 

.5543 

.5587 

70 

.5049 

.5082 

.5115 

.5149 

.518 3 

-5218 

.5254 

80 

.4809 

,4836 

.4864 

.4892 

.4921 

.4950 

,4980 

90 

.4603 

.4627 

.4651 

.4675 

.4699 

.4724 

.4750 

00 

.4425 

.4445 

.4466 

.4487 

.4508 

.4530 

.4552 


TABLE 2.3.1. Contd.


($\alpha$ = 0.01)


n | k   8   9   10   11   12   13   14   15


10 

1.0000 


11 

.9996 

1.0000 

12 

.9960 

.9997 

14 

.9733 

.9879 

15 

.95 71 

.9748 

16 

,9397 

.9590 

18 

.9045 

.9243 

20 

.8712 

.8900 

21 

.8555 

.8736 

24 

.3131 

.8283 

35 

.8002 

.8153 

27 

.7764 

,7901 

28 

.7653 

.7784 

30 

.7446 

.7566 

32 

.7257 

.7367 

33 

.7163 

.7274 

35 

.7001 

.7099 

36 

.6923 

.7017 

40 

.6635 

.6717 

42 

.6506 

.6582 

45 

.6328 

.6397 

48 

.6165 

.6228 

49 

.6114 

.6175 

50 

.6064 

.6124 

55 

.5835 

.5887 

60 

.5632 

.5679 

70 

.5291 

.5328 

80 

.5011 

.5042 

90 

.4775 

.4802 

100 

.4574 

.4597 


1.0000 



.9968 

.9998 

1.0000 

.9887 

.9971 

.9998 

.9761 

.9894 

,9973 

.9438 

.9622 

,9784 

.9092 

.9284 

.9473 

.8922 

.9112 

.9302 

.845 3 

.8624 

,8800 

.8310 

.8474 

.8644 

,8045 

.8194 

,8350 

.7921 

.8064 

.8213 

.7692 

.7822 

.7959 

.7482 

.7602 

.772 7 

.7384 

,7499 

.7619 

.7201 

.7307 

.7417 

.7115 

.7216 

.7322 

.6801 

,6889 

.6979 

.6661 

.6743 

.682 7 

.6468 

.6542 

.6618 

.6293 

,6361 

.6430 

.62 39 

.6304 

.6371 

.6185 

.6249 

.6314 

.5941 

,5996 

.6052 

.5726 

.5 775 

.5825 

.5366 

.5405 

.5445 

.5073 

.5105 

.5138 

.4828 

.4855 

,488 3 

.4619 

.4643 

,4666 


1.0000 


.9998 

1.0000 


.9907 

.9977 

.9999 

,9649 

.9802 

.9916 

.9488 

.9661 

.9809 

.8982 

.9166 

.9 349 

.8319 

,8999 

,9182 

.8512 

.8681 

,8854 

.8369 

.8530 

,8698 

.8101 

.8249 

.8403 

.7857 

.799 3 

.8134 

.7744 

..7874 

.8009 

.75 31 

.7651 

.7795 

.7432 

.7546 

.7665 

.7074 

.717 i 

.7273 

.6915 

.7005 

.7099 

,6697 

.6779 

.6863 

.6501 

.6575 

.6652 

.6440 

.6512 

.6586 

.6381 

.6451 

.6522 

.6111 

.6171 

.62 32 

.5876 

.5928 

,5982 

.5485 

.5527 

.5569 

.5171 

.5205 

.5240 

.4911 

.4939 

.4968 

.4690 

.4715 

.4739 
TABLE 2.3.1. Contd.

($\alpha$ = 0.025)


n | k   1   2   3   4   5   6   7

5 

►9740 

.9950 

1.000 





6 

.9525 

.9802 

.9967 

1 .0000 




7 

.9 302 

.9599 

.9842 

.9976 

1.0000 



8 

.9085 

.9 379 

.9653 

.9870 

.9982 

1,0000 


9 

.8879 

.9160 

,9439 

.9694 

.9890 

.9986 

1 . 0000 

10 

.8686 

.8951 

.9222 

.9487 

.9727 

.9905 

.9989 

11 

.8504 

.8752 

.9011 

.8273 

.9527 

.9753 

.9917 

12 

.8335 

.8567 

.8810 

.9062 

.9317 

.9561 

.9774 

14 

•8026 

.8230 

.8444 

.8670 

.8905 

.9147 

.9387 

15 

.7886 

.8077 

.8278 

.8491 

.8713 

.8945 

.9182 

16 

• 7754 

.7933 

.8122 

.8322 

.8533 

.8753 

.8981 

18 

.7512 

.7670 

.7838 

.8015 

.8202 

.8 399 

.8606 

20 

.7294 

.7436 

.7586 

.7744 

.7910 

.8086 

.8271 

21 

.7193 

.7328 

.7470 

.7619 

.7777 

.7943 

.8117 

24 

.6917 

.7034 

.7156 

.7284 

.7418 

.7559 

.7708 

25 

.6833 

.6944 

.7061 

.7183 

.7311 

.7445 

.7586 

27 

.6675 

.6777 

.6884 

.6995 

.7111 

.7233 

.7361 

28 

.6601 

.6699 

.6801 

.6907 

.7018 

.7135 

.7257 

30 

.6461 

.6551 

.6645 

.6743 

.6845 

.6951 

.7062 

32 

.6331 

.6414 

.6501 

.6592 

.6685 

.6783 

.6885 

33 

.6269 

.6350 

.6433 

.6520 

.6611 

.6705 

.6802 

35 

.6152 

.6227 

.6305 

.6385 

.6469 

.6556 

.6646 

36 

.6096 

.6168 

.6244 

.6321 

.6402 

,6486 

.6573 

40 

.5889 

.5953 

.6019 

.6087 

.6157 

.6230 

.6305 

42 r 

.5794 

.5854 

.5916 

.5980 

.6046 

.6115 

.6185 

45 

.5661 

.5717 

.5773 

.5832 

.5892 

.5955 

.6019 

48 

.5539 

.5590 

.5642 

.5696 

.5751 

.5808 

.5867 

49 

.5500 

.5549 

.5600 

.5653 

.5707 

.5762 

.5820 

50 

.5462 

.5510 

.5560 

.5611 

.5664 

.5718 

.5773 

55 

.5235 

.5328 

.5372 

.5417 

.5463 

.5511 

.5559 

60 

.5126 

.5164 

.5204 

.5244 

.5285 

.5328 

.5371 

70 

.4852 

.4884 

.4916 

.4949 

.4982 

.5017 

.5052 

80 

.4623 

.4649 

.4677 

.4704 

.4732 

.4761 

.4790 

90 

.4427 

.4450 

.4473 

.4496 

.4520 

.4545 

.4569 

100 

.4257 

.4276 

.4297 

.4317 

.4338 

.4359 

.4380 
TABLE 2.3.1. Contd.

($\alpha$ = 0.025)


n | k   8   9   10   11   12   13   14   15

10 

1.0000 








11 

.9991 

1.0000 







12 

.9926 

.9992 

1.0000 






14 

.9614 

.9808 

.9941 

.9995 

1.0000 




15 

.9416 

.9636 

.9821 

.9946 

.9995 

1.0000 



16 

.9213 

.9442 

.9655 

.9833 

.9951 

.9996 

1 .0000 


IS 

.8821 

.9043 

.9267 

.9486 

.9687 

.9852 

.9958 

.9997 

20 

,8465 

.8668 

.8879 

.9095 

.9313 

.9522 

.9713 

.9867 

21 

.8301 

.8494 

.8695 

.8904 

.9il8 

.9332 

.9538 

.9725 

24 

.7864 

.8028 

.8291 

.8382 

.8571 

.8768 

.8 971 

.9177 

25 

.7734 

.7890 

.8054 

.8226 

.8406 

.8594 

.8789 

.8991 

27 

.7495 

.7635 

.7783 

.7938 

.8100 

.82 71 

.8449 

.8635 

28 

.7384 

.7518 

.7658 

.7805 

.79 60 

.8122 

.8292 

.8470 

30 

,7179 

.7300 

.7428 

.7561 

.7701 

.7848 

.8001 

.8162 

32 

.6992 

.7103 

.7219 

.7340 

.7467 

.7601 

.7740 

.7886 

33 

.6904 

.7011 

.7122 

.7238 

.7359 

.7486 

.7619 

.7758 

35 

.6741 

.6838 

.6940 

.7047 

.7158 

.7274 

.7395 

.7522 

36 

.6664 

.6758 

.6855 

.6957 

.7064 

.7175 

.7291 

.7412 

40 

.6384 

.6465 

.6549 

.6636 

.6726 

.6820 

.6918 

.7020 

42 

.6258 

.6334 

.6412 

.6493 

.6577 

.6664 

.6755 

.6849 

45 

.6085 

.6154 

.6224 

.6297 

.6373 

.6451 

.65 32 

.6617 

48 

.5928 

.5990 

.6055 

.6121 

.6189 

.6260 

.6334 

.6409 

49 

.5378 

.59 39 

.6001 

.6066 

.61 32 

.6201 

.6272 

,6345 

50 

.5830 

.5889 

.5950 

.6012 

.6077 

.6143 

.6212 

.6283 

55 

.5609 

.5661 

.5714 

.5768 

.5824 

.5881 

.5940 

.6001 

60 

.5415 

.5460 

.5507 

.5555 

.5604 

.5654 

.5705 

.5758 

70 

.5087 

.5124 

.5161 

.5199 

.52 38 

.5278 

.5318 

.5 360 

80 

.4819 

.4849 

.4880 

.4911 

.4943 

.4976 

.5009 

.5043 

90 

.4594 

.4620 

.4646 

.4672 

.4699 

.4726 

.4754 

.4782 

100 

.4402 

.4424 

.4446 

.4469 

.4491 

.4515 

.45 39 

.4563 
TABLE 2.3.1. Contd.

($\alpha$ = 0.05)



n | k

1 

5 

.9587 

6 

.9326 

7 

.9074 

8 

.8840 

9 

.8623 

10 

.8424 

11 

.82 39 

12 

.8069 

14 

.7762 

15 

.7623 

16 

.749 3 

18 

.7256 

20 

.7044 

21 

.6946 

24 

.6679 

25 

.659 7 

27 

.6445 

28 

.6374 

30 

.62 39 

32 

.6113 

33 

,6054 

35 

.5941 

36 

.5887 

40 

,5688 

42 

.5597 

45 

.5470 

48 

.5352 

49 

.5315 

50 

.5279 

55 

.5109 

60 

.4957 

70 

,4694 

80 

.4474 

90 

.4287 

100 

.4123 


2 3 


.9900 ,9999 

,9635 .9933 

.9431 .9749 

.9177 .9503 

.8936 .9257 

,871^ .9014 

.8505 .8786 

.8314 .8575 

.7972 .8197 

.7819 .8029 

.7677 .7872 

.7417 .7588 

.7187 .7339 

.708i .7225 

.6795 . 69 1 7 

.6708 .6825 

.6547 .6653 

.6471 .6572 

.6328 .6421 

.6196 .6282 

.6134 .6216 

,60i5 .6092 

.5959 .6033 

.5751 .5816 

.5657 . 57 I 8 

.5524 .5580 

.5403 .5454 

.5 364 .5414 

.5326 .5375 

,5151 .5194 

.4994 .5033 

.4725 .4757 

.4500 .4527 

.4309 .4331 

.4143 .4162 


.9321 .9613 

.9079 .9374 

.8850 .9135 

.8436 ,8689 

,8251 .8488 


.6303 .6392 

.6172 .6255 

.6110 .6190 

.5884 .595 3 

.5781 .5846 

,5638 .5 697 

.5507 .5562 

.5466 .5519 

,5425 .5477 

.5239 .5284 


.9849 .99 78 

,9650 .9868 

.9419 .9680 

.8954 .9225 

,8737 .8997 


.6486 .6583 

.6341 .6431 

.6273 .6360 

.6025 .6100 

,5913 .5983 

.5759 .5822 

,5618 .5676 

«S573 .5629 

.5530 .5585 

.5331 ,5378 


.5073 .5113 
.4789 .4822 
.4554 .4581 
.4354 .4378 
.4182 .4202 


.5154 .5197 
.4855 .4889 
.4609 .4637 
.4401 .4426 
.4223 .4244 


.8080 .8 300 
.7770 .7964 
,7500 .7670 
.7376 .7537 
.7046 .7182 

.6947 .7076 
.6764 .6880 
.6678 .6789 
.6518 .6620 
.6372 ,6465 


.8534 .8780 

.8169 .8386 

.7850 .8042 

.7706 .7886 

.7324 .7475 

.7211 .7354 

.7003 .7131 

.6906 .7028 

.6726 .68 38 

.6562 ,6664 
TABLE 2.3.1. Contd.

($\alpha$ = 0.05)


n | k   8   9   10   11   12   13   14   15


10 

1.0000 








11 

.9982 

1.0000 







12 

.9883 

.9985 

1.0000 






14 

.9490 

.9728 

.9906 

.9989 

1.0000 




15 

.9261 

.9513 

.9747 

.9914 

.9990 

1.0000 



16 

.9036 

.9294 

,9544 

.9763 

.9922 

.9992 

1,0000 


IS 

.8615 

.8855 

.9102 

.9350 

.9586 

.9791 

.9933 

,9993 

20 

.8244 

.8458 

.8683 

.8917 

.9157 

,9396 

.9621 

,9812 

21 

.8077 

.8278 

.8491 

.8713 

,8945 

.9182 

.9416 

.9636 

24 

.7635 

.7803 

.7981 

.8169 

.8367 

.8575 

.8792 

.9017 

25 

.7505 

.7664 

.78 31 

.8009 

,8196 

.8393 

,8600 

.8816 

27 

.72 66 

.7409 

.7559 

.7717 

.7884 

,8060 

.8246 

.3441 

23 

.7157 

.7292 

.7434 

.7584 

.7742 

.7909 

.8084 

.82 69 

30 

.6954 

.7076 

.7205 

.7340 

.7481 

.7631 

.7788 

.7954 

32 

.6770 

.6882 

.6998 

.7120 

.7248 

,7383 

.7525 

.7674 

33 

.6685 

.6791 

.6902 

.7019 

.7141 

.7269 

.7404 

,7545 

35 

.6525 

.6622 

.6724 

.6830 

,6942 

.7058 

.7180 

.7308 

36 

.6449 

.6543 

.6641 

.6743 

.6849 

,6960 

,7077 

.7199 

40 

.6177 

.6257 

.6341 

.6427 

.6517 

.6611 

.6709 

,6811 

42 

.6055 

.6130 

.6207 

.6288 

.6371 

.6458 

.6548 

.6642 

45 

.5887 

.5955 

.6025 

,6097 

.6172 

.6250 

,6330 

.6414 

48 

*5735 

.5797 

.5860 

.5926 

.599 3 

.6063 

.6136 

.6211 

49 

.5687 

.5747 

.5809 

.5872 

.59 38 

.6006 

.6076 

.6148 

50 

.5641 

.5699 

.5759 

.5820 

•5884 

.5950 

.6017 

.6088 

55 

.5428 

.5478 

.5530 

.5583 

.5638 

.5695 

.5753 

.5813 

60 

.5240 

.5285 

.5330 

.5377 

•5425 

.5475 

.5525 

,55 78 

70 

.4924 

.4960 

.4996 

.5034 

.5072 

.5111 

.5151 

.519i 

80 

.4666 

.4696 

.4726 

.4756 

.4788 

.4819 

.4852 

.4885 

90 

.4450 

.4475 

.4500 

.4526 

• 4552 

.4579 

.4606 

.4634 

LOO 

.4265 

.4286 

.4308 

.4330 

.435 3 

.4375 

.4398 

.4422 
TABLE 2.3.1. Contd.


($\alpha$ = 0.10)


n | k   1   2   3   4   5   6   7


5 

.9343 

.9800 

.9995 

6 

.9042 

.9500 

,9867 

7 

.8770 

.9192 

.9601 

8 

.8526 

.8907 

.9 302 

9 

.8 306 

.8649 

.9014 

10 

.8106 

.8416 

.8749 

11 

.7924 

.8205 

.8508 

12 

.7756 

.8012 

.8289 

14 

.7457 

.7674 

.7907 

15 

.7323 

.7524 

,1139 

16 

.7198 

.7384 

.7584 

18 

.6970 

.7132 

,7306 

20 

.6766 

.6910 

.7063 

21 

.6672 

.6808 

.6952 

24 

.6417 

.6533 

.6655 

25 

.6340 

.6450 

,65 66 

27 

.6195 

.6295 

,6400 

28 

.6126 

.6222 

.6323 

30 

.5998 

.6086 

.6178 

32 

.5879 

.5960 

.6045 

33 

.5822 

.5900 

.5982 

35 

.5715 

.5788 

.5863 

36 

.5664 

.5734 

.5807 

40 

.5474 

.5536 

.5600 

42 

.5 338 

.5446 

.5506 

45 

.5267 

.5320 

.5375 

48 

.5155 

.5204 

.5254 

49 

.5119 

.5167 

.5216 

50 

.5085 

.5131 

.5179 

55 

.4923 

.4964 

.5096 

60 

.4778 

.4815 

.485 3 

70 

.4528 

,4558 

.4589 

80 

.4319 

,4344 

,4370 

90 

.4140 

.4161 

,4183 

100 

.3984 

.4003 

.4022 


.9998 

.9905 

.9671 

.9999 

.9929 

.9999 


.9 385 

.9722 

.9944 

1 .0000 

.9100 

.9451 

.9760 

.9956 

.8833 

,9171 

.9504 

.9790 

.8587 

.8904 

.9230 

.9547 

.8159 

.8428 

.8716 

.9018 

.7971 

.8220 

.8486 

.8769 

.7798 

.8029 

,8275 

.85 39 

.7491 

.7690 

.7902 

.8129 

.7225 

.7398 

.7582 

.7779 

.7105 

.7267 

,7439 

.7623 

.6784 

.6919 

.7063 

.7215 

,6688 

.6817 

.6952 

.7096 

.6511 

.6627 

.6749 

.6878 

.6428 

.6539 

.6655 

.6777 

.6275 

.6375 

.6481 

.6592 

,6133 

.6226 

.6322 

.6423 

,6067 

.6156 

,6248 

.6345 

.5942 

.6024 

.6109 

.6198 

.5883 

.5962 

,6044 

.6129 

.5666 

.5734 

,5805 

.5878 

.5568 

.5632" 

.5693 

.5766 

.5431 

.5489 

.5550 

.5612 

.5306 

.5 359 

.5414 

.5471 

.5266 

.5 318 

.5 372 

.5427 

.5228 

.5279 

.5 331 

.5381 

,5050 

.5094 

.5140 

.5186 

.4891 

.49 31 

.4971 

.5013 

.4621 

.4653 

.4685 

.4719 

,4396 

.442 3 

.4450 

.4478 

.4206 

.4228 

.4252 

.4275 

,4041 

.4061 

.4081 

.4101 
TABLE 2.3.1. Contd.


($\alpha$ = 0.10)


n | k   8   9   10   11   12   13   14   15

10 

1.0000 








11 

,9964 

1.0000 







12 

.9814 

.9970 

1.0000 






14 

.9 324 

.9 615 

.9850 

.9978 

1 .0000 




15 

.9065 

.9362 

,9642 

.9864 

.9981 

1 ,0000 



16 

.8317 

.9106 

.9396 

.9665 

.9875 

.9983 

1 .0000 


18 

.8 371 

.8629 

.8899 

.9177 

.945 3 

.9703 

.9894 

.9987 

20 

.7989 

.8214 

.8452 

.8704 

.8967 

.9236 

.9499 

.9734 

21 

.7819 

.8029 

,8251 

.8488 

.8737 

.8997 

.9261 

.9518 

24 

.7377 

.7549 

.7731 

.7925 

.8131 

.8350 

.8581 

.8824 

25 

.7248 

.7410 

.7581 

.7763 

.7956 

,8162 

.8379 

.8609 

27 

.7013 

.7157 

,7309 

.7470 

.7640 

.7821 

.8013 

.8217 

28 

.6906 

.7042 

.7185 

.7337 

.7498 

,7668 

.7848 

,8040 

30 

.6708 

.68 30 

.6959 

.7095 

.7238 

.7389 

.7549 

.7719 

32 

.6529 

.6640 

.6756 

.6879 

.7007 

.7143 

.7286 

.7437 

33 

.6446 

.6552 

• 666 3 

.6779 

.6901 

.7030 

.71 66 

.7309 

35 

.6291 

.6387 

.6489 

,6594 

.6705 

.6822 

.6944 

.7073 

36 

.6218 

.6311 

.6407 

,6509 

.6615 

.6726 

.6842 

.6965 

40 

.5955 

.6034 

.6116 

.6202 

.6291 

,6384 

.6481 

.6583 

42 

.5837 

.5911 

.5987 

.6067 

.6149 

.6235 

.6325 

.6418 

45 

.5676 

.5742 

.58il 

,5883 

.5956 

.6033 

,6113 

.6195 

48 

.55 30 

.5590 

.5652 

.5717 

.5784 

.5853 

.5924 

.5998 

49 

.5484 

.5542 

.5603 

.5665 

.5730 

.5797 

.5866 

.5937 

50 

.5439 

.5496 

.5555 

.5615 

.5678 

.5743 

.5809 

.5879 

55 

.52 35 

.5284 

.5335 

.5337 

.5441 

.5497 

,5554 

.5613 

60 

.5055 

.5099 

.5143 

.5189 

.5236 

.5285 

.5 334 

.5 385 

70 

.4753 

.4787 

.482 3 

.4859 

.4897 

.4935 

.4974 

.5014 

80 

.4506 

.45 35 

.4564 

.4594 

.4624 

.4655 

.4687 

.4719 

90 

.4299 

.4323 

.4348 

.4373 

.4398 

.4424 

.4451 

.4478 

100 

.4122 

.4143 

.4164 

.4185 

.4207 

.4229 

•4252 

.4275 
TABLE 2.3.2. Nominal upper critical values $u_\alpha$ for
$\alpha$ = 0.02625 and selected values of n and k.

    n      k      $u_\alpha$

   10      1      0.8669
   11      1      0.8487
   20      8      0.8450
   21      1      0.7176
   48     13      0.6247


TABLE 2.3.3. Nominal upper critical values $v_\alpha$ for
$\alpha$ = 0.02625 and selected values of n and k.

    n      k      $v_\alpha$

   20      1      0.7504
   20      8      0.8643
   48     13      0.6432
TABLE 2.5.2. Exact value of $M(c,c,\rho,p)$ in top row and its bound
given at equation (2.5.3) in bottom row for $\rho = -0.5$.

  p | c     0.10      0.20      0.30      0.40      0.50

   2      0.13478   0.10257   0.06968   0.03568   0.00000
          0.43591   0.36901   0.29517   0.20483   0.00000

   3      0.11944   0.07793   0.04274   0.01527   0.00000
          0.40000   0.30000   0.20000   0.10000   0.00000

   4      0.10862   0.06225   0.02821   0.00720   0.00000
          0.37353   0.25232   0.14238   0.05204   0.00000

   5      0.10015   0.05107   0.01938   0.00357   0.00000
          0.35200   0.21600   0.10400   0.02800   0.00000

   6      0.09312   0.04262   0.01363   0.00183   0.00000
          0.33361   0.18697   0.07719   0.01537   0.00000

   7      0.08712   0.03599   0.00976   0.00096   0.00000
          0.31744   0.16308   0.05792   0.00856   0.00000

   8      0.08186   0.03065   0.00708   0.00051   0.00000
          0.30295   0.14305   0.04381   0.00481   0.00000

   9      0.07720   0.02629   0.00519   0.00028   0.00000
          0.28979   0.12604   0.03334   0.00273   0.00000

  10      0.07301   0.02268   0.00383   0.00015   0.00000
          0.27772   0.11143   0.02550   0.00156   0.00000

TABLE 2.5.4. Exact value of $M(c,c,\rho,p)$ in top row and its bound given at
equation (2.5.3) in bottom —
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CHAPTER III 


TWO OUTLIERS IN A RANDOM SAMPLE FROM 


3.1. Introduction


Let $Y_1, Y_2, \ldots, Y_n$ be independently and normally distributed
observations which constitute a random sample of size $n$ from a
$N(\mu, \sigma^2)$ distribution. In this chapter, we shall denote the
$i$th order statistic by $y_{(i)}$, where
$y_{(1)} \le y_{(2)} \le \cdots \le y_{(n)}$ are obtained by rearranging
$y_1, y_2, \ldots, y_n$ in increasing order of magnitude. Now, the
residual vector is $e = (e_1, e_2, \ldots, e_n)'$, where

$$e_i = y_i - \bar{y}, \qquad \bar{y} = \sum_{i=1}^{n} y_i / n$$

is the sample mean, and the error sum of squares

$$S = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

is based on $n-1$ degrees of freedom.

The elements of the variance covariance matrix $\Lambda$ are given by

$$\lambda_{ij} = \begin{cases} -1/n & \text{if } i \ne j \\ (n-1)/n & \text{if } i = j, \end{cases} \tag{3.1.1}$$

and the correlation coefficient is $\rho_{ij} = -1/(n-1) = \rho$.
Denote the common variance of residuals by $\lambda = (n-1)/n$. Hence
the studentized residual as defined in (1.2.5) is given by

$$w_i = e_i / (S_p \lambda)^{1/2}, \qquad i = 1, 2, \ldots, n, \tag{3.1.2}$$

where $S_p = S + \nu S_\nu^2$ is the pooled sum of squares based on
$p = n-1+\nu$ degrees of freedom.

Equation (2.1.1) now reduces to

$$u_{ij} = (e_i + e_j) / [S_p \{2\lambda(1+\rho)\}]^{1/2} = [n/\{2(n-2)\}]^{1/2} \, [(y_i + y_j - 2\bar{y}) / S_p^{1/2}].$$

This gives

$$U = \max_{1 \le i < j \le n} u_{ij} = [n/\{2(n-2)\}]^{1/2} \, [(y_{(n)} + y_{(n-1)} - 2\bar{y}) / S_p^{1/2}] = [n/\{2(n-2)\}]^{1/2} M,$$

where $M = (y_{(n)} + y_{(n-1)} - 2\bar{y}) / S_p^{1/2}$ is Murphy's
statistic for two outliers. This has been studied in detail by
Hawkins (1978). He also provides exact upper $100\alpha$ percent points
of $M$ for n = 5(1)15(5)30, $\alpha$ = 0.001, 0.01, 0.05 and 0.1, and
$\nu$ = 0, 5, 15 and 30. Power studies for $M$ have also been done by
him. Note that for $\nu = 0$, Murphy's statistic is commonly defined by
(Barnett and Lewis, 1978)

$$T_{N3} = (y_{(n)} + y_{(n-1)} - 2\bar{y}) / s,$$

where $s^2 = S/(n-1)$. Consequently for $\nu = 0$,

$$U = [n/\{2(n-1)(n-2)\}]^{1/2} \, T_{N3}. \tag{3.1.3}$$

Simulated upper percentage points of $T_{N3}$ are tabulated by Barnett
and Lewis for n = 5(1)10(2)20(10)50, 100, and $\alpha$ = 0.01 and 0.05.

Similarly, for the two sided statistic $V$, we have
$v_{ij} = (y_i - y_j)/(2 S_p)^{1/2}$ and

$$V = \max_{1 \le i < j \le n} |v_{ij}| = (y_{(n)} - y_{(1)}) / [2\{(n-1)s^2 + \nu S_\nu^2\}]^{1/2}.$$

For $\nu = 0$, $V$ reduces to

$$V = [1/\{2(n-1)\}^{1/2}] \, T_{N6}, \tag{3.1.4}$$

where $T_{N6} = (y_{(n)} - y_{(1)})/s$ is the internally studentized
range statistic.

This statistic $T_{N6}$ has been used as the discordancy test for a
lower and upper outlier-pair $y_{(1)}, y_{(n)}$ in a normal sample with
$\mu$ and $\sigma^2$ unknown, by David, Hartley and Pearson (1954),
Pearson and Stephens (1964), Shapiro, Wilk and Chen (1968),
Barnett and Lewis (1978) etc.

The above calculations show that our statistics $U$ and $V$ reduce to
the widely used statistics $M$ and $T_{N6}$ respectively for $\nu = 0$.
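The reductions (3.1.3) and (3.1.4) can be checked numerically on any sample. The sketch below is illustrative only: it takes $\nu = 0$ (so that $S_p = S$), the function name `two_outlier_statistics` is hypothetical, and the data in the usage note are arbitrary.

```python
import math
from itertools import combinations


def two_outlier_statistics(y):
    """Return (U, V, T_N3, T_N6) for a single sample, taking nu = 0."""
    n = len(y)
    ybar = sum(y) / n
    e = [yi - ybar for yi in y]           # residuals e_i = y_i - ybar
    S = sum(ei * ei for ei in e)          # error sum of squares, n - 1 d.f.
    lam = (n - 1) / n                     # common variance factor of residuals
    rho = -1.0 / (n - 1)                  # common correlation of residuals
    w = [ei / math.sqrt(S * lam) for ei in e]   # studentized residuals (3.1.2)
    pairs = list(combinations(range(n), 2))
    U = max((w[i] + w[j]) / math.sqrt(2 * (1 + rho)) for i, j in pairs)
    V = max(abs(w[i] - w[j]) / math.sqrt(2 * (1 - rho)) for i, j in pairs)
    ys = sorted(y)
    s = math.sqrt(S / (n - 1))
    T_N3 = (ys[-1] + ys[-2] - 2 * ybar) / s     # Murphy's statistic, nu = 0 form
    T_N6 = (ys[-1] - ys[0]) / s                 # internally studentized range
    return U, V, T_N3, T_N6
```

For instance, with `y = [2.1, 3.4, 1.9, 5.6, 2.8, 7.3, 3.0]` the returned `U` should equal `math.sqrt(n / (2 * (n - 1) * (n - 2))) * T_N3` and `V` should equal `T_N6 / math.sqrt(2 * (n - 1))` up to rounding error.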


3.2. Distribution theory

In this case the marginal and joint distributions are immediately
obtained from the general results derived in Section 2.2, with k = 1
and $\rho_{ij} = \rho = -1/(n-1)$ for all $i \ne j$.

The marginal distribution of $u_{ij}$ is given at (2.2.11) with
$p = n-1+\nu$. The joint density of $u_{ij}$ and $u_{i_1 j_1}$ given
at (2.2.9) reduces to

$$f(u_{ij}, u_{i_1 j_1}) = \frac{n-3+\nu}{2\pi (1-\rho_u^2)^{1/2}} \left[ 1 - \frac{1}{1-\rho_u^2} \left( u_{ij}^2 - 2\rho_u u_{ij} u_{i_1 j_1} + u_{i_1 j_1}^2 \right) \right]^{(n-5+\nu)/2} \tag{3.2.1}$$

and the function is defined over

$$u_{ij}^2 - 2\rho_u u_{ij} u_{i_1 j_1} + u_{i_1 j_1}^2 < 1 - \rho_u^2,$$

where the shape parameter from equation (2.2.8) is

$$\rho_u = \frac{\rho_{i i_1} + \rho_{i j_1} + \rho_{j i_1} + \rho_{j j_1}}{2[(1+\rho_{ij})(1+\rho_{i_1 j_1})]^{1/2}}.$$

In particular when $i = 1$, $j = 2$, $i_1 = 1$, $j_1 = 3$, that is for
the joint distribution of $(u_{12}, u_{13})$, we have

$$\rho_u = (1+3\rho)/[2(1+\rho)] = (n-4)/[2(n-2)]. \tag{3.2.2}$$

Similarly, for the joint distribution of $(u_{12}, u_{34})$ we have

$$\rho_u = 4\rho/[2(1+\rho)] = -2/(n-2). \tag{3.2.3}$$

Let $N = n(n-1)/2$, which is the total number of distinct $u_{ij}$'s
used in determining the statistic $U$. The total number of distinct
combinations of two $u_{ij}$'s is given by

$$\binom{N}{2} = \frac{N(N-1)}{2} = \frac{n(n-1)(n-2)(n+1)}{8}. \tag{3.2.4}$$

Out of these pairs, for $u_{ij}$ and $u_{i_1 j_1}$, there are 4
different types of combinations, which are given below:

  (i)   $i = i_1$, $j \ne j_1$, like $(u_{12}, u_{13})$
  (ii)  $i \ne i_1$, $j = j_1$, like $(u_{13}, u_{23})$
  (iii) $j = i_1$, like $(u_{12}, u_{23})$
  (iv)  $i, j, i_1, j_1$ all distinct, like $(u_{12}, u_{34})$.

These types along with their shape parameter values and the number
of combinations are given in Table 3.2.1.

TABLE 3.2.1. Types of combinations of two $u_{ij}$'s with
corresponding shape parameter values.

  Type                   No. of combinations     Shape parameter

  $(u_{12}, u_{13})$     n(n-1)(n-2)/6           (n-4)/[2(n-2)]
  $(u_{13}, u_{23})$     n(n-1)(n-2)/6           (n-4)/[2(n-2)]
  $(u_{12}, u_{23})$     n(n-1)(n-2)/6           (n-4)/[2(n-2)]
  $(u_{12}, u_{34})$     n(n-1)(n-2)(n-3)/8      -2/(n-2)


From Table 3.2.1 it is clear that essentially there are only 2 types
of pairs of $u_{ij}$'s, having shape parameter values (n-4)/[2(n-2)]
and -2/(n-2). In fact three types of pairs give the same value of the
shape parameter, and we can use the condensed Table 3.2.2. Such
condensed tables are useful for the general case, where it is not
possible to count the number of pairs directly.

Similarly, we can obtain shape parameters for the $v_{ij}$'s also.
These are given in Table 3.2.3. The condensed table is given in
Table 3.2.4.

TABLE 3.2.2. Condensed table for the combinations of two $u_{ij}$'s
with corresponding shape parameter values.

  Serial No. i     Shape parameter $\rho_i$        Frequency

  1                $\rho_1$ = (n-4)/[2(n-2)]       n(n-1)(n-2)/2
  2                $\rho_2$ = -2/(n-2)             n(n-1)(n-2)(n-3)/8

TABLE 3.2.3. Types of combinations of two $v_{ij}$'s with
corresponding shape parameter values.

  Type                   No. of combinations     Shape parameter

  $(v_{12}, v_{13})$     n(n-1)(n-2)/6           1/2
  $(v_{13}, v_{23})$     n(n-1)(n-2)/6           1/2
  $(v_{12}, v_{23})$     n(n-1)(n-2)/6           -1/2
  $(v_{12}, v_{34})$     n(n-1)(n-2)(n-3)/8      0

TABLE 3.2.4. Condensed table for the combinations of two $v_{ij}$'s
with corresponding shape parameter values.

  Serial No. i     Shape parameter $\rho'_i$       Frequency

  1                $\rho'_1$ = 1/2                 n(n-1)(n-2)/3
  2                $\rho'_2$ = -1/2                n(n-1)(n-2)/6
  3                $\rho'_3$ = 0                   n(n-1)(n-2)(n-3)/8
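The counts in the condensed tables can be confirmed by brute-force enumeration for a small n. The following sketch is illustrative; it assumes the equicorrelated structure $\rho = -1/(n-1)$ of this chapter, and the function name `shape_parameter_counts` and its `sign` argument (+1 for the $u_{ij}$'s, -1 for the $v_{ij}$'s) are hypothetical conveniences.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def shape_parameter_counts(n, sign=1):
    """Tally shape parameters over all unordered pairs of index pairs.

    sign=+1 treats u_ij = (w_i + w_j)/[2(1+rho)]^(1/2);
    sign=-1 treats v_ij = (w_i - w_j)/[2(1-rho)]^(1/2).
    """
    rho = Fraction(-1, n - 1)

    def corr(a, b):
        # Correlation between studentized residuals w_a and w_b.
        return Fraction(1) if a == b else rho

    index_pairs = list(combinations(range(n), 2))
    counts = Counter()
    for (i, j), (i1, j1) in combinations(index_pairs, 2):
        numerator = (corr(i, i1) + sign * corr(i, j1)
                     + sign * corr(j, i1) + corr(j, j1))
        denominator = 2 * (1 + sign * rho)
        counts[numerator / denominator] += 1
    return counts
```

With n = 6, `shape_parameter_counts(6)` should give 60 pairs with shape parameter 1/4 and 45 pairs with -1/2, in line with Table 3.2.2, while `shape_parameter_counts(6, sign=-1)` should give 40, 20 and 45 pairs for 1/2, -1/2 and 0, in line with Table 3.2.4.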
3.3. Comparison of percentile points with tabulated values

The upper and lower limits for the true percentage points can be
obtained by using Bonferroni inequalities.

Nominal upper percentile points of $U$ and $V$ are given by the
equations (2.3.1) and (2.3.2) respectively. These are given in
Table 2.3.1 for $\nu = 0$ and different values of n and $\alpha$.

For a lower bound of the true percentile point, we use equation
(2.3.3) etc. Thus for the statistic $U$ we need to solve the second
Bonferroni inequality

$$\binom{n}{2} \Pr(u_{ij} > u) - \sum\sum \Pr(u_{ij} > u,\ u_{i_1 j_1} > u) = \alpha, \tag{3.3.1}$$

where the double sum is over all distinct terms which appear in the
second term of the Bonferroni inequality. Clearly, if all the
bivariate probabilities appearing in equation (3.3.1) are zero, then
$u_\alpha = U_\alpha(e)$, the exact critical point. From equation
(2.4.12) we see that a sufficient condition for the bivariate
probability $M(h,h,\rho_i,p)$ to be equal to zero is that

$$\rho_i < 2h^2 - 1,$$

or equivalently

$$h > [(1+\rho_i)/2]^{1/2}.$$

McMillan (1971) has shown that this condition holds for statistic $M$
if $M_\alpha > [(3n-8)/(2n)]^{1/2}$, where
$M_\alpha = [2(n-2)/n]^{1/2} u_\alpha$, that is if
$u_\alpha > [(3n-8)/\{4(n-2)\}]^{1/2}$. For $\nu = 0$, this is
satisfied for $n \le 10$ when $\alpha$ = 0.05 and $n \le 13$ when
$\alpha$ = 0.01. For slightly larger n he states that the error is
negligible. We now obtain the same result by considering the joint
distributions.

From Table 3.2.2, we see that the shape parameters are
$\rho_1 = (n-4)/[2(n-2)]$ and $\rho_2 = -2/(n-2)$; hence, the
bivariate probabilities appearing in equation (3.3.1) are zero if

$$u_\alpha > \max\left[ \{(1+\rho_1)/2\}^{1/2},\ \{(1+\rho_2)/2\}^{1/2} \right].$$

Further $\rho_2 < \rho_1$, and hence we get the exact critical values
if

$$u_\alpha > [(1+\rho_1)/2]^{1/2}.$$

Substituting for $\rho_1$ we get the condition as

$$u_\alpha > [(3n-8)/\{4(n-2)\}]^{1/2} = a_n \ \text{(say)}.$$

Table 3.3.1 gives the values of $a_n$ for n = 5(1)15.

TABLE 3.3.1. Value of $a_n = [(3n-8)/\{4(n-2)\}]^{1/2}$ for
determining the exact critical values.

    n       $a_n$

    5      0.76376
    6      0.79057
    7      0.80623
    8      0.81650
    9      0.82375
   10      0.82916
   11      0.83333
   12      0.83666
   13      0.83937
   14      0.84163
   15      0.84353
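The exactness check against Table 3.3.1 amounts to comparing each nominal point with $a_n$. A minimal sketch follows; the dictionary `NOMINAL_U_05` copies the k = 1, $\alpha$ = 0.05 column of Table 2.3.1 for n = 5 to 12 only, and the function names are hypothetical.

```python
import math


def exactness_threshold(n):
    """a_n = [(3n - 8) / {4(n - 2)}]^(1/2) as in Table 3.3.1."""
    return math.sqrt((3 * n - 8) / (4 * (n - 2)))


# Nominal u_alpha for k = 1, nu = 0, alpha = 0.05 (subset of Table 2.3.1).
NOMINAL_U_05 = {5: 0.9587, 6: 0.9326, 7: 0.9074, 8: 0.8840,
                9: 0.8623, 10: 0.8424, 11: 0.8239, 12: 0.8069}


def largest_exact_n(nominal):
    """Largest tabulated n whose nominal point exceeds a_n (None if none)."""
    exact = [n for n, u in nominal.items() if u > exactness_threshold(n)]
    return max(exact) if exact else None
```

Here `largest_exact_n(NOMINAL_U_05)` returns 10, which matches the statement that the nominal point is exact for n up to 10 when $\alpha$ = 0.05.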
The nominal critical points for n, $\alpha$, k = 1 and $\nu = 0$ are
already tabulated in Table 2.3.1. However, for comparison purposes,
we again tabulate them for $\alpha$ = 0.01, 0.05; $\nu$ = 0, 5; k = 1,
and n = 5(1)15(5)60 in Table 3.3.2. Comparing with Table 3.3.1, we
see that for $\nu = 0$ the critical point $u_\alpha$ is exact for n
not exceeding 10, 13 and 14 when $\alpha$ = 0.05, 0.01 and 0.005
respectively. Similarly for $\nu = 5$, the values are exact for n up
to 5 and 7 when $\alpha$ = 0.05 and 0.01 respectively.

For comparison of other values, we evaluate
$M_\alpha = [2(n-2)/n]^{1/2} u_\alpha$. These are also given in
Table 3.3.2. Exact critical values of Murphy's statistic $M$ are
tabulated by Hawkins (1978). Exact percentile points of $M$ given by
Hawkins for $\nu = 0$, $\alpha$ = 0.05 and n = 10, 20 and 30 are
1.066, 0.944 and 0.848 respectively. From Table 3.3.2, the
corresponding values of $M_\alpha$ obtained through $u_\alpha$ are
1.066, 0.945 and 0.852 respectively. Similarly for $\nu = 5$ and
$\alpha$ = 0.05 the critical values given by Hawkins for n = 10, 20
and 30 are 0.918 (0.917), 0.860 (0.863) and 0.793 (0.798)
respectively, where the values shown within brackets are taken from
Table 3.3.2. In general our values agree up to two decimal places in
all the cases and up to three decimal places for small values of n.

In Table 3.3.3 we have obtained the nominal critical values of
$T_{N3}$ through $u_\alpha$. For comparison purposes we have also
tabulated simulated values of $T_{N3}$ as given by Barnett and Lewis
(1978) for n = 10, 20, 50 and 100 when $\alpha$ = 0.01 and 0.05.
Here again we find that our values closely agree with the simulated
values. There is some deviation for large values of n, which is due
to the fact that our values are evaluated with the assumption of zero
bivariate probabilities. This is not true when n is large.

Similarly the internally studentized range statistic $T_{N6}$ is
related to $V$ as mentioned in (3.1.4). Since $|\rho'_1| = 1/2$ is the
maximum of all $|\rho'_i|$'s given in Table 3.2.4, on applying the
condition given at equation (2.4.13), we see that the critical values
$v_\alpha$ are exact if

$$v_\alpha > [(1+|\rho'_1|)/2]^{1/2} = 0.8660.$$

Nominal critical points $v_\alpha$ for n, $\alpha$, k = 1 and
$\nu = 0$ can be obtained from Table 2.3.1. Again for comparison
purposes, we tabulate them for $\alpha$ = 0.01, 0.05; $\nu = 0$, k = 1
and n = 3(1)10(1)20(10)100 along with the corresponding values
$[2(n-1)]^{1/2} v_\alpha$ of $T_{N6}$ obtained through $v_\alpha$ in
Table 3.3.4. From these tables it follows that $v_\alpha$ is exact
for $n \le 10$ when $\alpha$ = 0.05 and for $n \le 13$ when
$\alpha$ = 0.01.

The critical values of $T_{N6}$ for n = 5, 16, 20 and 60 given by
Barnett and Lewis (1978), with our values within brackets, are
2.75 (2.755), 4.24 (4.247), 4.49 (4.496) and 5.51 (5.568) when
$\alpha$ = 0.05 and 2.80 (2.803), 4.52 (4.519), 4.80 (4.800) and
5.94 (5.960) when $\alpha$ = 0.01 respectively. Here we find that
the nominal critical points of $T_{N6}$ agree almost completely with
the exact critical values for small values of n, and there is a very
small deviation for large n.


3.4. Lower bound for type I error probability

The number of combinations of $u_{ij}$'s or $v_{ij}$'s obtained in
Section 3.2 is used for obtaining lower bounds for type I error
probabilities of the U and V statistics in sub-sections 3.4.1 and
3.4.2 respectively.


3.4.1. Lower bound for type I error probability of U

Using the notations introduced in Section 2.4, we see that
the probability

$\Pr(u_{ij} > h,\ u_{i'j'} > h \mid \rho, p) = M(h, h, \rho, p).$

For notational convenience denote the $u_{ij}$'s with a single
subscript as $u_i$ (i = 1, 2, ..., N), where N = n(n-1)/2.

Using Bonferroni inequalities, we have

(3.4.1)  $S_1 - S_2 \le \Pr(U > u_\alpha) \le S_1,$

where $S_1 = \sum_{i=1}^{N} \Pr(u_i > u_\alpha)$ and
$S_2 = \sum\sum_{1 \le i < j \le N} \Pr(u_i > u_\alpha,\ u_j > u_\alpha)$.

Note that $S_1 = \alpha$, while from Table 3.2.2, we have

$S_2 = [n(n-1)(n-2)/2]\,\Pr(u_i > u_\alpha,\ u_j > u_\alpha \mid \rho_1, p)$
$\quad + [n(n-1)(n-2)(n-3)/8]\,\Pr(u_i > u_\alpha,\ u_j > u_\alpha \mid \rho_2, p),$

where $\rho_1$ and $\rho_2$ are as given in Table 3.2.2.
Hence

$S_2 = [n(n-1)(n-2)/8]\,[4\,M(u_\alpha, u_\alpha, (n-4)/\{2(n-2)\}, p)$
$\quad + (n-3)\,M(u_\alpha, u_\alpha, -2/(n-2), p)].$
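
The two frequencies entering $S_2$ can be checked by direct enumeration. The following is a minimal sketch, not part of the original computations: it lists all pairs of the N = n(n-1)/2 statistics for a given n, counts those whose index pairs have one common index and those with none, and evaluates the two shape parameters quoted above. The function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations


def pair_type_counts(n):
    """Count pairs (u_ij, u_kl) by whether their index pairs share an index."""
    index_pairs = list(combinations(range(n), 2))   # the N = n(n-1)/2 pairs (i, j)
    shared_one = disjoint = 0
    for a, b in combinations(index_pairs, 2):
        if set(a) & set(b):
            shared_one += 1      # e.g. u_12 and u_13 (shape parameter rho_1)
        else:
            disjoint += 1        # e.g. u_12 and u_34 (shape parameter rho_2)
    return shared_one, disjoint


def shape_parameters(n):
    """rho_1 = (n-4)/{2(n-2)} and rho_2 = -2/(n-2), as used in S_2."""
    return Fraction(n - 4, 2 * (n - 2)), Fraction(-2, n - 2)


n = 10
shared_one, disjoint = pair_type_counts(n)
print(shared_one, n * (n - 1) * (n - 2) // 2)            # both counts agree
print(disjoint, n * (n - 1) * (n - 2) * (n - 3) // 8)    # both counts agree
print(shape_parameters(n))
```

The two counts add up to N(N-1)/2, the total number of bivariate terms in $S_2$.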

The bivariate probabilities can be calculated using the
method described in Section 2.4. Using this value of $S_2$ in
the expression (3.4.1), we get a lower bound for $\Pr(U > u_\alpha)$.

These lower bounds for the type I error probability of U are
shown in Table 3.4.1 for $\alpha$ = 0.01, 0.05; $\nu$ = 0, 5 and
n = 5(1)15(5)50. This table also shows that the nominal upper
percentage points are quite close to the true upper points for
n < 30. The lower limit decreases very rapidly as n increases.

Such a rapid decrease indirectly shows that equation
(2.3.3) will not have a solution for large values of n. This
is due to the fact that the lower bound $S_1 - S_2$ appearing in
equation (3.4.1) becomes negative for large n. Consequently, we
do not have a systematic procedure for determining a lower limit
$u_{L\alpha}$ for the exact critical value $U_\alpha(e)$ in such cases. In
Section 3.5, we describe a method for obtaining approximate
critical values of Murphy's test statistic. The approximation is
remarkably good, and there is no need to calculate the lower
limit for large n.

3.4.2. Lower bound for type I error probability of V

From equation (2.4.10) we now have

(3.4.2)  $\Pr(|v_{ij}| > h,\ |v_{i'j'}| > h \mid \rho' = \rho, p) = 2\,[M(h, h, \rho, p) + M(h, h, -\rho, p)].$

Similar to the $u_{ij}$'s, we denote the $v_{ij}$'s with single
subscripts as $v_i$'s (i = 1, 2, ..., N).

The Bonferroni inequality then gives

(3.4.3)  $S_1 - S_2 \le \Pr(V > v_\alpha) \le S_1,$

where $S_1 = \sum_{i=1}^{N} \Pr(|v_i| > v_\alpha) = \alpha$ and
$S_2 = \sum\sum_{1 \le i < j \le N} \Pr(|v_i| > v_\alpha,\ |v_j| > v_\alpha)$.

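In this case, using Table 3.2.4 and equation (3.4.2), we get

$S_2 = 2\,[n(n-1)(n-2)/2]\,[M(v_\alpha, v_\alpha, 1/2, p) + M(v_\alpha, v_\alpha, -1/2, p)$
$\quad + \{(n-3)/2\}\,M(v_\alpha, v_\alpha, 0, p)]$

$\ = n(n-1)(n-2)\,[M(v_\alpha, v_\alpha, 1/2, p) + M(v_\alpha, v_\alpha, -1/2, p)$
$\quad + \{(n-3)/2\}\,M(v_\alpha, v_\alpha, 0, p)].$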
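
The shape parameters 1/2, -1/2 and 0 and their frequencies can be verified in the same way. The sketch below is an illustration under one stated assumption: the residuals of a single random sample are taken to be equicorrelated with correlation -1/(n-1), a value that in fact cancels from the result. The function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def v_shape_parameter_counts(n):
    """Frequencies of |rho'| over all pairs (v_ij, v_kl), i < j and k < l."""
    rho = Fraction(-1, n - 1)            # assumed correlation between two residuals

    def corr(a, b):
        return Fraction(1) if a == b else rho

    index_pairs = list(combinations(range(n), 2))
    counts = Counter()
    for (i, j), (k, l) in combinations(index_pairs, 2):
        # Covariance of (w_i - w_j) and (w_k - w_l), scaled by 2(1 - rho)
        numerator = corr(i, k) - corr(i, l) - corr(j, k) + corr(j, l)
        counts[abs(numerator / (2 * (1 - rho)))] += 1
    return counts


counts = v_shape_parameter_counts(10)
print(counts[Fraction(1, 2)], counts[Fraction(0)])
```

Pairs with one common index give $|\rho'| = 1/2$ and pairs with no common index give $\rho' = 0$, which is the grouping used in the expression for $S_2$.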

We get a lower bound for the required probability by
substituting this value of $S_2$ in (3.4.3) with $S_1 = \alpha$. These
lower bounds are tabulated in Table 3.4.2 for $\alpha$ = 0.01, 0.05;
$\nu = 0$ and n = 10(2)20(10)50. For n < 10, the lower bound is
equal to $\alpha$, as for these values of n and $\alpha$ the nominal percentage
points are exact. Similar to the case of U, here also the lower
limit decreases rapidly as n increases.

3.5. Approximate upper percentage points of Murphy's test
statistic for two outliers

In this section we restrict our attention to the
important case of $\nu = 0$, and illustrate that approximate
percentage points of Murphy's statistic can be obtained from the
tabulated percentage points of the studentized range statistic
$T_{N6}$. As shown in Section 3.1, the statistics U, M and $T_{N3}$ are
equivalent for $\nu = 0$. The relationships between the exact
percentage points of these statistics are

(3.5.1)  $M_\alpha(e) = [2(n-2)/n]^{1/2}\,U_\alpha(e),$

(3.5.2)  $T_{N3,\alpha}(e) = [2(n-1)(n-2)/n]^{1/2}\,U_\alpha(e),$

where $M_\alpha(e)$, $T_{N3,\alpha}(e)$ and $U_\alpha(e)$ are the exact percentage
points of M, $T_{N3}$ and U respectively. The percentage points
$M_\alpha(e)$ have been tabulated by Hawkins (1978) by evaluating a
complicated integral for selected values of $n \le 30$. Simulated
percentage points of $T_{N3}$ have been tabulated by Barnett and
Lewis (1978) for n = 5(1)10(2)20(10)50, 100. As mentioned in
Section 3.3, the nominal percentage point $u_\alpha$ of U is exact for
n < 10 and $\alpha = 0.05$. For smaller values of $\alpha$, it is exact for
slightly larger values of n.

Next consider the statistics V and $T_{N6}$. The relationship
between the exact percentage points of these two statistics is
obtained from (3.1.4) and is given by

(3.5.3)  $T_{N6,\alpha}(e) = [2(n-1)]^{1/2}\,V_\alpha(e),$

where $V_\alpha(e)$ and $T_{N6,\alpha}(e)$ are the exact percentage points of V
and $T_{N6}$ respectively. The percentage points of $T_{N6}$ were
originally tabulated by David, Hartley and Pearson (1954) for
selected values of n up to 1000. Their table has been extended
by Pearson and Stephens (1964), and an abridged table is
reproduced in Pearson and Hartley (1970).

The nominal percentage points $u_\alpha$ and $v_\alpha$ actually provide
an upper limit to the exact values $U_\alpha(e)$ and $V_\alpha(e)$ respectively.
From the calculations performed in Sections 3.3 and 3.4, we see
that $u_\alpha$ and $v_\alpha$ are quite close to the true percentage points
for small values of n. Further, from equations (2.3.1) and
(2.3.2) it follows that

(3.5.4)  $u_\alpha = v_{2\alpha}.$

Since both $u_\alpha$ and $v_{2\alpha}$ are upper limits and are close to
the true values, we expect that the exact percentage points
of U and V are approximately related by an equation similar to
(3.5.4). Thus, we are led to the approximation

(3.5.5)  $U_\alpha(a) = V_{2\alpha}(e),$

where $U_\alpha(a)$ stands for an approximate value of $U_\alpha(e)$. Now
using equations (3.5.1), (3.5.2), (3.5.3) and (3.5.5) we
immediately get

(3.5.6)  $U_\alpha(a) = [1/\{2(n-1)\}^{1/2}]\,T_{N6,2\alpha}(e),$

(3.5.7)  $T_{N3,\alpha}(a) = [(n-2)/n]^{1/2}\,T_{N6,2\alpha}(e),$

(3.5.8)  $M_\alpha(a) = [(n-2)/\{n(n-1)\}]^{1/2}\,T_{N6,2\alpha}(e),$

where $U_\alpha(a)$, $T_{N3,\alpha}(a)$ and $M_\alpha(a)$ stand for approximate values
of $U_\alpha(e)$, $T_{N3,\alpha}(e)$ and $M_\alpha(e)$ respectively. To study the
behaviour of this approximation, we compare the tabulated $M_\alpha(e)$
values with the approximate values obtained from equation (3.5.8).
Unfortunately, due to the limited tabulation of $T_{N6,2\alpha}(e)$ and
$M_\alpha(e)$, we can do it only for $\alpha = 0.05$ and selected values of n.
To extend our comparison, we also compare $T_{N3,\alpha}(a)$ with the
simulated values of $T_{N3,\alpha}$ given by Barnett and Lewis (1978). These
values are tabulated in Table 3.5.1 for n = 10(1)20(5)50(10)100,
200, 500 and 1000. In this table, entries in the columns
corresponding to $T_{N6,2\alpha}(e)$, the simulated $T_{N3,\alpha}$ and $M_\alpha(e)$ are
taken from Pearson and Stephens (1964), Barnett and Lewis (1978)
and Hawkins (1978) respectively. As can be seen, the approximation
is remarkably good for almost all values of n for which the exact
or simulated values are available. Some discrepancy could be due
to the fact that $T_{N6,2\alpha}(e)$ is available only up to three
significant digits. Consequently, the last digit in the approximate
values may be in error. Further, due to sampling variation, the
simulated value of $T_{N3,\alpha}$ may differ from the exact value.

In view of such a close agreement, we recommend the use
of the approximate values for the statistics M and $T_{N3}$ given in
Table 3.5.1, whenever the exact values are not available,
especially for large values of n.

In this study, we have proposed the statistic U, which
of course is related to M and $T_{N3}$. We therefore provide a
table of approximate percentage points $U_\alpha(a)$ obtained from
equation (3.5.6). Again, due to the limited $T_{N6,2\alpha}(e)$ values
available in Pearson and Stephens (1964), we tabulate $U_\alpha(a)$
for n = 10(1)20(10)100, 200, 500, 1000 and $\alpha$ = 0.005, 0.025
and 0.05 only. From the observations made above, $U_\alpha(a)$ is
expected to be quite close to the true value $U_\alpha(e)$. Thus, for
Murphy's test we need not use the nominal percentage points
but can use the approximate percentage points $U_\alpha(a)$. It should
however be remembered that while $u_\alpha$ restricts the probability
of type I error to $\alpha$, we cannot make any such claim about $U_\alpha(a)$.
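
Equations (3.5.6) to (3.5.8) amount to rescaling a tabulated percentage point of the studentized range. As a minimal sketch (the function name is an illustrative assumption), the three approximations for n = 10 and $\alpha = 0.05$ can be recomputed from the tabulated value $T_{N6,2\alpha}(e) = 3.57$:

```python
import math


def approximate_points(n, t_n6_2alpha):
    """Approximate U, T_N3 and M points from the 2*alpha point of T_N6."""
    u_a = t_n6_2alpha / math.sqrt(2 * (n - 1))                  # equation (3.5.6)
    t_n3_a = math.sqrt((n - 2) / n) * t_n6_2alpha               # equation (3.5.7)
    m_a = math.sqrt((n - 2) / (n * (n - 1))) * t_n6_2alpha      # equation (3.5.8)
    return u_a, t_n3_a, m_a


u_a, t_n3_a, m_a = approximate_points(10, 3.57)
print(round(u_a, 3), round(t_n3_a, 3), round(m_a, 3))   # 0.841 3.193 1.064
```

These agree with the n = 10 entries of Tables 3.5.1 and 3.5.2.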
TABLE 3.3.2. Nominal upper percentile points of U and
Murphy's statistic

      $\nu = 0$                                       $\nu = 5$
      $\alpha = 0.01$         $\alpha = 0.05$         $\alpha = 0.01$         $\alpha = 0.05$
n     $u_\alpha$  $M_\alpha$  $u_\alpha$  $M_\alpha$  $u_\alpha$  $M_\alpha$  $u_\alpha$  $M_\alpha$

5     0.9859      1.080       0.9587      1.050       0.8467      0.928       0.7646      0.838
6     0.9700      1.120       0.9326      1.077       0.8364      0.966       0.7598      0.877
7     0.9518      1.138       0.9074      1.085       0.8251      0.986       0.7523      0.899
8     0.9330      1.143       0.8840      1.083       0.8137      0.997       0.7436      0.911
9     0.9144      1.140       0.8624      1.076       0.8023      1.001       0.7344      0.916
10    0.8964      1.134       0.8424      1.066       0.7911      1.001       0.7251      0.917
11    0.8792      1.125       0.8239      1.054       0.7802      0.998       0.7158      0.916
12    0.8628      1.114       0.8069      1.042       0.7696      0.994       0.7067      0.912
13    0.8472      1.102       0.7910      1.029       0.7595      0.988       0.6978      0.908
14    0.8325      1.090       0.7762      1.016       0.7496      0.982       0.6892      0.902
15    0.8186      1.078       0.7623      1.004       0.7402      0.974       0.6808      0.896
20    0.7583      1.018       0.7044      0.945       0.6973      0.936       0.6430      0.863
25    0.7115      0.965       0.6598      0.895       0.6624      0.899       0.6112      0.829
30    0.6730      0.920       0.6239      0.852       0.6324      0.864       0.5841      0.798
35    0.6409      0.880       0.5941      0.816       0.6064      0.833       0.5607      0.770
40    0.6135      0.846       0.5688      0.784       0.5837      0.805       0.5402      0.745
45    0.5897      0.815       0.5470      0.756       0.5637      0.779       0.5220      0.722
50    0.5688      0.788       0.5279      0.731       0.5458      0.756       0.5058      0.701
55    0.5503      0.764       0.5109      0.709       0.5297      0.735       0.4913      0.682
60    0.5337      0.742       0.4957      0.689       0.5151      0.716       0.4780      0.665

TABLE 3.3.3. Nominal and tabulated critical values of $T_{N3}$

        $\alpha = 0.05$           $\alpha = 0.01$
n       Nominal     Tabulated     Nominal     Tabulated

10      3.192       3.18          3.397       3.40
20      4.113       4.11          4.431       4.41
50      5.120       5.06          5.517       5.51
100     5.743       5.62          6.164       6.06
TABLE 3.3.4. Nominal upper percentile points of V and $T_{N6}$
for $\nu = 0$

      $\alpha = 0.01$         $\alpha = 0.05$
n     $v_\alpha$  $T_{N6}$    $v_\alpha$  $T_{N6}$

3     1.0000      2.000       0.9997      1.999
4     0.9983      2.445       0.9917      2.429
5     0.9911      2.803       0.9740      2.755
6     0.9788      3.095       0.9525      3.012
7     0.9636      3.338       0.9302      3.222
8     0.9470      3.543       0.9085      3.399
9     0.9301      3.720       0.8879      3.552
10    0.9133      3.875       0.8686      3.685
11    0.8970      4.012       0.8504      3.803
12    0.8813      4.134       0.8335      3.909
13    0.8663      4.244       0.8176      4.005
14    0.8519      4.344       0.8026      4.093
15    0.8382      4.435       0.7886      4.173
16    0.8251      4.519       0.7754      4.247
17    0.8127      4.597       0.7630      4.316
18    0.8008      4.669       0.7512      4.380
19    0.7894      4.737       0.7400      4.440
20    0.7786      4.800       0.7294      4.496
30    0.6917      5.268       0.6461      4.921
40    0.6308      5.571       0.5889      5.201
50    0.5849      5.790       0.5462      5.407
60    0.5487      5.960       0.5126      5.568
70    0.5191      6.098       0.4852      5.700
80    0.4943      6.213       0.4623      5.811
90    0.4731      6.311       0.4427      5.906
100   0.4546      6.397       0.4257      5.990
TABLE 3.4.1. Lower limits for type I error probability by
using $u_\alpha$ for the statistic U

      $\nu = 0$                             $\nu = 5$
n     $\alpha = 0.01$    $\alpha = 0.05$    $\alpha = 0.01$    $\alpha = 0.05$

5     0.01000            0.05000            0.01000            0.05000
6     0.01000            0.05000            0.01000            0.04999
7     0.01000            0.05000            0.01000            0.04994
8     0.01000            0.05000            0.01000            0.04983
9     0.01000            0.05000            0.01000            0.04965
10    0.01000            0.05000            0.01000            0.04942
11    0.01000            0.05000            0.01000            0.04915
12    0.01000            0.04999            0.00999            0.04885
13    0.01000            0.04993            0.00999            0.04852
14    0.01000            0.04980            0.00998            0.04817
15    0.01000            0.04961            0.00996            0.04781
20    0.00997            0.04805            0.00986            0.04585
25    0.00987            0.04602            0.00971            0.04383
30    0.00972            0.04392            0.00953            0.04184
35    0.00953            0.04183            0.00936            0.03997
40    0.00936            0.03991            0.00914            0.03800
45    0.00913            0.03794            0.00898            0.03642
50    0.00838            0.03619            0.00856            0.03472
TABLE 3.4.2. Lower limits for type I error probability by
using $v_\alpha$ for the statistic V

n     $\alpha = 0.01$    $\alpha = 0.05$

10    0.01000            0.05000
12    0.01000            0.04996
14    0.01000            0.04970
16    0.01000            0.04924
18    0.00999            0.04864
20    0.00996            0.04798
30    0.00973            0.04447
40    0.00938            0.04105
50    0.00909            0.03801
TABLE 3.5.1. Comparison of approximate and exact percentage
points of the statistics for two outliers for $\alpha = 0.05$

n     $T_{N6,2\alpha}(e)$  $T_{N3,\alpha}(a)$  $T_{N3,\alpha}$ (simulated)  $M_\alpha(a)$  $M_\alpha(e)$

10    3.57                 3.193               3.18                         1.064          1.066
11    3.68                 3.329               -                            1.053          1.055
12    3.78                 3.451               3.44                         1.040          1.043
13    3.87                 3.560               -                            1.028          1.030
14    3.95                 3.657               3.66                         1.014          1.018
15    4.02                 3.742               -                            1.000          1.005
16    4.09                 3.826               3.83                         0.988          -
17    4.15                 3.898               -                            0.975          -
18    4.21                 3.969               3.96                         0.963          -
19    4.27                 4.039               -                            0.952          -
20    4.32                 4.098               4.11                         0.940          0.944
25    4.53                 4.345               -                            0.887          0.893
30    4.70                 4.541               4.56                         0.843          0.848
35    4.84                 4.700               -                            0.806          -
40    4.96                 4.834               4.84                         0.774          -
45    5.06                 4.946               -                            0.746          -
50    5.14                 5.036               5.06                         0.719          -
60    5.29                 5.201               -                            0.677          -
70    5.41                 5.332               -                            0.642          -
80    5.51                 5.441               -                            0.612          -
90    5.60                 5.537               -                            0.587          -
100   5.68                 5.623               5.62                         0.565          -
200   6.15                 6.119               -                            0.434          -
500   6.72                 6.706               -                            0.300          -
1000  7.11                 7.103               -                            0.225          -
TABLE 3.5.2. Approximate upper percentile points of the
statistic U

      $\alpha = 0.05$                     $\alpha = 0.025$                    $\alpha = 0.005$
n     $T_{N6,2\alpha}(e)$  $U_\alpha(a)$  $T_{N6,2\alpha}(e)$  $U_\alpha(a)$  $T_{N6,2\alpha}(e)$  $U_\alpha(a)$

10    3.57                 0.841          3.685                0.869          3.875                0.913
11    3.68                 0.823          3.80                 0.850          4.012                0.897
12    3.78                 0.806          3.91                 0.834          4.134                0.881
13    3.87                 0.790          4.00                 0.817          4.244                0.866
14    3.95                 0.775          4.09                 0.802          4.34                 0.851
15    4.02                 0.760          4.17                 0.788          4.44                 0.839
16    4.09                 0.747          4.24                 0.774          4.52                 0.825
17    4.15                 0.734          4.31                 0.762          4.60                 0.813
18    4.21                 0.722          4.37                 0.749          4.67                 0.801
19    4.27                 0.712          4.43                 0.738          4.74                 0.790
20    4.32                 0.701          4.49                 0.728          4.80                 0.779
30    4.70                 0.617          4.89                 0.642          5.26                 0.691
40    4.96                 0.562          5.16                 0.584          5.56                 0.630
50    5.14                 0.519          5.35                 0.540          5.77                 0.583
60    5.29                 0.487          5.51                 0.507          5.94                 0.547
70    5.41                 0.461          5.63                 0.479          6.07                 0.517
80    5.51                 0.438          5.73                 0.456          6.18                 0.492
90    5.60                 0.420          5.82                 0.436          6.27                 0.470
100   5.68                 0.404          5.90                 0.419          6.36                 0.452
200   6.15                 0.308          6.39                 0.320          6.84                 0.343
500   6.72                 0.213          6.94                 0.220          7.42                 0.235
1000  7.11                 0.159          7.33                 0.164          7.80                 0.175












CHAPTER IV


APPLICATION TO A TWO-WAY LAYOUT

4.1. Introduction

Two outliers in a random sample from $N(\mu, \sigma^2)$, discussed in
Chapter III, have been studied in maximum detail. After this,
the most widely studied case is the detection of outliers in a
two-way layout having a single observation in each cell. For
example, Gentleman and Wilk (1975a), John and Draper (1978),
Galpin and Hawkins (1981), Bradu and Hawkins (1982) etc. have
discussed detection of one or more outliers in two-way tables.
In this chapter, we will apply the general theory discussed in
Chapter II to this case. For convenience we restrict our
attention to $\nu = 0$ and use double subscripts in this chapter.

In Section 4.2, we derive the statistics for detecting
two outliers. We then analyse the shape parameters existing
between different $u_{i_1j_1,i_2j_2}$'s or $v_{i_1j_1,i_2j_2}$'s in Section 4.3, and
discuss how these quantities can be used for finding bounds
for exact percentile points. Finally, in Section 4.4 we obtain
the percentile points by the Monte Carlo method and compare them
with the nominal upper percentile points obtained in Section 2.3.

4.2. Test statistics

The model for a two-way layout with r rows and c columns
and a single observation in each cell is

(4.2.1)  $E(Y_{ij}) = \mu + \alpha_i + \gamma_j$,  i = 1, 2, ..., r;  j = 1, 2, ..., c,

where $\mu$ is the general mean, $\alpha_i$ is the ith row effect and $\gamma_j$
is the jth column effect. The total number of observations
is thus rc = n.

Rewriting (4.2.1) in the usual linear model form, we have

$E(Y) = X\beta,$

where $\beta$ is the vector of the parameters $\mu$, $\alpha_i$ and $\gamma_j$,
and X is the design matrix having rank k = r+c-1.

The residuals for this model are

$e_{ij} = y_{ij} - y_{i.} - y_{.j} + y_{..}$,  i = 1, 2, ..., r;  j = 1, 2, ..., c,

where

$y_{ij}$ is the observation in the (i,j)th cell,

$y_{i.} = \sum_j y_{ij}/c$,

$y_{.j} = \sum_i y_{ij}/r$, and

$y_{..} = \sum_i \sum_j y_{ij}/rc$.

The error sum of squares is

$S = \sum_i \sum_j e_{ij}^2,$

which is also equal to $S_p$ as we are only considering the case
$\nu = 0$. The degrees of freedom for S are
p = n-k = rc-(r+c-1) = (r-1)(c-1).


The correlation coefficient between any two residuals
$e_{ij}$ and $e_{i_1j_1}$ depends upon their position in the table. It is
given by

(4.2.2)  $\rho(e_{ij}, e_{i_1j_1}) = R(ij, i_1j_1) = r_1$ if $i \ne i_1$ and $j \ne j_1$,
         $\qquad = r_2$ if $i = i_1$ and $j \ne j_1$,
         $\qquad = r_3$ if $i \ne i_1$ and $j = j_1$,

where $r_1 = 1/[(r-1)(c-1)]$, $r_2 = -1/(c-1)$ and $r_3 = -1/(r-1)$,
and $R(ij, i_1j_1)$ is used for notational convenience. Further,
the variance of each residual is $\lambda\sigma^2$, where $\lambda = (r-1)(c-1)/rc$.
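
The three correlations in (4.2.2) follow from the coefficients with which the observations enter each residual. As a small exact check (a sketch only; the function names are illustrative assumptions), the coefficient of $y_{i_1j_1}$ in $e_{ij}$ is $(\delta_{ii_1} - 1/r)(\delta_{jj_1} - 1/c)$, which is the corresponding element of the idempotent matrix A, so that correlations are ratios of its elements:

```python
from fractions import Fraction


def projection_entry(r, c, cell_a, cell_b):
    """Element of the residual projection matrix for two cells of an r x c layout."""
    (i, j), (i1, j1) = cell_a, cell_b
    return (Fraction(int(i == i1)) - Fraction(1, r)) * \
           (Fraction(int(j == j1)) - Fraction(1, c))


def residual_correlation(r, c, cell_a, cell_b):
    """Exact correlation between the residuals of two cells."""
    # All diagonal elements are equal, so one of them serves as the common variance factor.
    return projection_entry(r, c, cell_a, cell_b) / projection_entry(r, c, cell_a, cell_a)


r, c = 4, 5
print(projection_entry(r, c, (0, 0), (0, 0)))        # lambda = (r-1)(c-1)/rc
print(residual_correlation(r, c, (0, 0), (1, 1)))    # r_1 = 1/[(r-1)(c-1)]
print(residual_correlation(r, c, (0, 0), (0, 1)))    # r_2 = -1/(c-1)
print(residual_correlation(r, c, (0, 0), (1, 0)))    # r_3 = -1/(r-1)
```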


Expressions for U and V become complicated due to the double
suffix notation. However, U is the maximum of $N = \binom{rc}{2}$ random
variables and is given by

$U = \max[\{u_{ij,ij_1} : 1 \le i \le r,\ 1 \le j < j_1 \le c\},$
$\qquad \{u_{ij,i_1j_1} : 1 \le i < i_1 \le r,\ 1 \le j \le c,\ 1 \le j_1 \le c\}].$

A similar expression for V also holds.

For this case, the random variables needed for defining the
statistics U and V are

(4.2.3)  $u_{ij,i_1j_1} = \dfrac{1}{(\lambda S)^{1/2}} \left[ \dfrac{e_{ij} + e_{i_1j_1}}{\{2(1 + \rho(e_{ij}, e_{i_1j_1}))\}^{1/2}} \right]$

and

(4.2.4)  $v_{ij,i_1j_1} = \dfrac{1}{(\lambda S)^{1/2}} \left[ \dfrac{e_{ij} - e_{i_1j_1}}{\{2(1 - \rho(e_{ij}, e_{i_1j_1}))\}^{1/2}} \right],$

$i, i_1 = 1, 2, ..., r$ and $j, j_1 = 1, 2, ..., c$.

In actual practice one does not have to evaluate all these
quantities. Usually, calculations for a few residuals having
extreme values are sufficient. This has been illustrated in
Example 2.1.1 of Section 2.1. It is observed that calculations
for four or five residuals give U. Similarly, calculations
for the largest three and smallest three residuals are sufficient
for V in most cases. Only in some extreme cases may one have
to do additional calculations.
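
For small tables the statistics can also be obtained by evaluating every pair. The following is a minimal sketch under stated assumptions: it takes $\nu = 0$, assumes the weighted residuals are $e_{ij}/(\lambda S)^{1/2}$ with $\lambda = (r-1)(c-1)/rc$, requires r and c to be at least 3, and uses hypothetical function names and made-up data.

```python
import math
from itertools import combinations


def two_way_residuals(y):
    """Residuals e_ij = y_ij - y_i. - y_.j + y_.. of an additive two-way fit."""
    r, c = len(y), len(y[0])
    row = [sum(y[i]) / c for i in range(r)]
    col = [sum(y[i][j] for i in range(r)) / r for j in range(c)]
    grand = sum(row) / r
    return [[y[i][j] - row[i] - col[j] + grand for j in range(c)] for i in range(r)]


def residual_corr(r, c, a, b):
    """Correlation (4.2.2) between the residuals of two distinct cells a and b."""
    if a[0] == b[0]:
        return -1.0 / (c - 1)                # r_2: same row
    if a[1] == b[1]:
        return -1.0 / (r - 1)                # r_3: same column
    return 1.0 / ((r - 1) * (c - 1))         # r_1: different row and column


def u_and_v(y):
    """Brute-force U = max u and V = max |v| over all pairs of cells."""
    r, c = len(y), len(y[0])
    e = two_way_residuals(y)
    s = sum(x * x for line in e for x in line)            # error sum of squares S
    scale = math.sqrt((r - 1) * (c - 1) / (r * c) * s)    # assumed factor (lambda S)^(1/2)
    cells = [(i, j) for i in range(r) for j in range(c)]
    big_u = big_v = float("-inf")
    for a, b in combinations(cells, 2):
        rho = residual_corr(r, c, a, b)
        ea, eb = e[a[0]][a[1]], e[b[0]][b[1]]
        big_u = max(big_u, (ea + eb) / (scale * math.sqrt(2 * (1 + rho))))
        big_v = max(big_v, abs(ea - eb) / (scale * math.sqrt(2 * (1 - rho))))
    return big_u, big_v


# Made-up 3 x 4 table with one inflated cell, for illustration only.
table = [[10.0, 12.0, 11.0, 13.0],
         [11.5, 13.0, 12.0, 14.5],
         [12.0, 14.0, 30.0, 15.0]]
print(u_and_v(table))
```

Since the residuals do not change when row or column constants are added, the two statistics are unaffected by the row and column effects.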


4.3. Calculation of shape parameters

For finding the joint distribution of $u_{i_1j_1,i_2j_2}$ and
$u_{i_3j_3,i_4j_4}$, or of $v_{i_1j_1,i_2j_2}$ and $v_{i_3j_3,i_4j_4}$, it is sufficient to
consider the four residuals which are involved in defining
these quantities. Gentleman (1980) has shown that there are
60 such combinations having distinct variance-covariance matrices
in a two-way table with r and c greater than or equal to 4.
For other values of r and c, the number of these matrices is
less than 60. These matrices can be identified with appropriate
binary numbers or with their corresponding decimal numbers. The
decimal numbers for these 60 matrices are given by Gentleman.
In Table 4.3.1, we reproduce these decimal numbers as follows:

Suppose R denotes the correlation matrix of the residuals
$e_{i_1j_1}$, $e_{i_2j_2}$, $e_{i_3j_3}$ and $e_{i_4j_4}$, $i_1, i_2, i_3, i_4 = 1, 2, ..., r$ and
$j_1, j_2, j_3, j_4 = 1, 2, ..., c$. Gentleman (1980) has shown that a
decimal number can be used to represent a variance-covariance
matrix. The same method can be extended to the correlation matrix
R, since the variances of all residuals are equal. We describe the
procedure of Gentleman briefly. A given R can be uniquely
represented as two 4x4 binary matrices which will be denoted by
H and J. Denoting the 4 cell subscripts as $(i_1, j_1), ..., (i_4, j_4)$,
the (a,b) elements of H and J are defined as follows:

H(a,b) = 1 if $i_a = i_b$,
       = 0 otherwise,

J(a,b) = 1 if $j_a = j_b$,
       = 0 otherwise,

where a = 1, 2, ..., 4; b = 1, 2, ..., 4. As an example, suppose we
want to express the correlation matrix of $(e_{13}, e_{21}, e_{22}, e_{23})$ in
decimal form. The matrix R can be written down immediately by
using equation (4.2.2). Thus

      1    r_1  r_1  r_3
R =   r_1  1    r_2  r_2
      r_1  r_2  1    r_2
      r_3  r_2  r_2  1

Further, the matrices H and J are given by

      1  0  0  0            1  0  0  1
H =   0  1  1  1  ,   J =   0  1  0  0
      0  1  1  1            0  0  1  0
      0  1  1  1            1  0  0  1

Since H and J are symmetric and all diagonal elements of
both the matrices are necessarily 1, a given R can be uniquely
represented by a binary number consisting of the 12 binary digits
in the lower triangles of H and J (omitting diagonals), ordered,
say, by rows, with the elements of H preceding those of J. Thus
the binary number corresponding to the example considered above is
001011000100. For the sake of brevity, this binary number
is then converted to the corresponding decimal number. For the
example cited above the decimal number is 708.


The method for writing R from a given decimal number is
just the reverse. We first calculate the corresponding 12 digit
binary number, and obtain the matrices H and J. For the (a,b)th
element of R we then have

R(a,b) = 1    if H(a,b) = 1, J(a,b) = 1,
       = r_1  if H(a,b) = 0, J(a,b) = 0,
       = r_2  if H(a,b) = 1, J(a,b) = 0,
       = r_3  if H(a,b) = 0, J(a,b) = 1.
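
The coding and decoding rules described above can be sketched as follows. This is only an illustration of the procedure as stated in the text, with hypothetical function names; it is not Gentleman's program. Cells are given as (row, column) subscripts in the order in which the residuals are listed.

```python
def encode(cells):
    """Decimal code of four cells (i, j): lower triangles of H, then of J, by rows."""
    bits = []
    for key in (0, 1):                 # 0: row subscripts give H, 1: column subscripts give J
        for a in range(1, 4):          # rows 2, 3, 4 of the lower triangle
            for b in range(a):
                bits.append("1" if cells[a][key] == cells[b][key] else "0")
    return int("".join(bits), 2)


def decode(number):
    """Binary matrices H and J corresponding to a decimal code."""
    bits = [int(ch) for ch in format(number, "012b")]
    matrices = []
    for block in (bits[:6], bits[6:]):
        m = [[1 if a == b else 0 for b in range(4)] for a in range(4)]
        pos = 0
        for a in range(1, 4):
            for b in range(a):
                m[a][b] = m[b][a] = block[pos]
                pos += 1
        matrices.append(m)
    return matrices[0], matrices[1]


def correlation_labels(number):
    """Symbolic correlation matrix R obtained from H and J."""
    h, j = decode(number)
    label = {(1, 1): "1", (0, 0): "r1", (1, 0): "r2", (0, 1): "r3"}
    return [[label[(h[a][b], j[a][b])] for b in range(4)] for a in range(4)]


print(encode([(1, 3), (2, 1), (2, 2), (2, 3)]))   # 708, as in the example above
for line in correlation_labels(708):
    print(line)
```

Note that the code depends on the order in which the four cells are listed, so a fixed ordering convention is needed before codes of different sets of cells are compared.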


As an illustration, for the decimal number 530 given in
Table 4.3.1, we have the binary number equal to 001000010010, and
the matrices H and J as

      1  0  0  0            1  0  1  0
H =   0  1  1  0  ,   J =   0  1  0  1
      0  1  1  0            1  0  1  0
      0  0  0  1            0  1  0  1

so that

      1    r_1  r_3  r_1
R =   r_1  1    r_2  r_3
      r_3  r_2  1    r_1
      r_1  r_3  r_1  1

This, for example, is


the correlation matrix of 




The entire work can be accomplished conveniently on a
computer. Gentleman (1980) has not counted the number of matrices
of each type. Although not directly needed for our study, this
has been done for 3x3 and 4x5 layouts, by generating all
possible $\binom{rc}{4}$ combinations of 4 residuals for each case and
obtaining the corresponding correlation matrix. For each matrix,
the corresponding decimal number was obtained by the process
described above. The decimal numbers were then counted to give
the desired frequencies. These are also reported in Table 4.3.1.
Note that for a 3x3 layout, there are only 38 decimal numbers.
In other words, if we consider all possible $\binom{9}{4} = 126$
combinations of 4 residuals, and obtain their correlation
matrices, then there are only 38 such matrices. In addition,
r = c = 3 implies that $r_2 = -1/(c-1) = -0.5$ and $r_3 = -1/(r-1) = -0.5$.
This in turn reduces the number of distinct correlation matrices
further. In this case, the correlation matrices corresponding
to 6 pairs of decimal numbers

(72,513), (96,2049), (120,3585), (544,2056), (545,2120) and
(736,2059) are identical. Thus essentially we have only 32
distinct correlation matrices for this case.


4.3.1. Shape parameters of the bivariate distribution of
$u_{i_1j_1,i_2j_2}$ and $u_{i_3j_3,i_4j_4}$

The shape parameter plays an important role in the
determination of bivariate probabilities. In the present
case, for the distribution of $u_{i_1j_1,i_2j_2}$ and $u_{i_3j_3,i_4j_4}$, it is
given by

(4.3.1)  $\rho(u_{i_1j_1,i_2j_2},\ u_{i_3j_3,i_4j_4}) = \dfrac{R(i_1j_1, i_3j_3) + R(i_1j_1, i_4j_4) + R(i_2j_2, i_3j_3) + R(i_2j_2, i_4j_4)}{2\,[\{1 + R(i_1j_1, i_2j_2)\}\{1 + R(i_3j_3, i_4j_4)\}]^{1/2}},$

where $i_1, i_2, i_3, i_4 = 1, 2, ..., r$; $j_1, j_2, j_3, j_4 = 1, 2, ..., c$, and
$R(i_1j_1, i_2j_2)$ etc. are defined at equation (4.2.2).


Now for defining $u_{i_1j_1,i_2j_2}$ and $u_{i_3j_3,i_4j_4}$ we need 4
residuals. The correlation matrix R of these contains $\binom{4}{2} = 6$
distinct correlations. Out of these six, two occur in the
denominator of the expression for the shape parameter given
in equation (4.3.1). Thus any correlation matrix R of 4
residuals can give rise to at most $\binom{6}{2} = 15$ different shape
parameters. Hence, there are at most 900 values of the shape
parameters among the 60 different matrices. However, out of
these 900 values only 40 are distinct. Out of these 40,
some may not exist or may coincide for special values of r
and c. We shall denote them by $\rho_i$ (i = 1, 2, ..., 40). Also,
since there are $N = \binom{rc}{2}$ distinct $u_{i_1j_1,i_2j_2}$'s, the number
of bivariate distributions and shape parameters involved is
$\binom{N}{2}$. Consequently, for 3x3, 4x5 and 5x6 layouts the total
numbers of shape parameters are 630, 17955 and 94395 respectively.
Different types of combinations of $u_{i_1j_1,i_2j_2}$ and $u_{i_3j_3,i_4j_4}$, the
corresponding expressions for the shape parameters $\rho$ in terms of
$r_1$, $r_2$ and $r_3$, their frequencies for 3x3, 4x5 and 5x6 layouts
and the numerical values of $\rho$ for these cases are tabulated in
Table 4.3.2.
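
The counts quoted above and the expression (4.3.1) are easy to evaluate directly. The following is a minimal sketch with hypothetical function names; cells are (row, column) pairs and the correlations are those of equation (4.2.2).

```python
import math


def residual_corr(r, c, a, b):
    """R(a, b) of equation (4.2.2), with R = 1 when both cells coincide."""
    if a == b:
        return 1.0
    if a[0] == b[0]:
        return -1.0 / (c - 1)                # r_2: same row
    if a[1] == b[1]:
        return -1.0 / (r - 1)                # r_3: same column
    return 1.0 / ((r - 1) * (c - 1))         # r_1: different row and column


def shape_parameter_u(r, c, a1, a2, a3, a4):
    """Shape parameter (4.3.1) between u built on (a1, a2) and u built on (a3, a4)."""
    num = (residual_corr(r, c, a1, a3) + residual_corr(r, c, a1, a4)
           + residual_corr(r, c, a2, a3) + residual_corr(r, c, a2, a4))
    den = 2 * math.sqrt((1 + residual_corr(r, c, a1, a2))
                        * (1 + residual_corr(r, c, a3, a4)))
    return num / den


def number_of_bivariate_distributions(r, c):
    """Number of pairs among the N = C(rc, 2) statistics."""
    return math.comb(math.comb(r * c, 2), 2)


for r, c in [(3, 3), (4, 5), (5, 6)]:
    print(r, c, number_of_bivariate_distributions(r, c))   # 630, 17955, 94395
print(shape_parameter_u(4, 5, (1, 1), (1, 2), (1, 1), (2, 3)))
```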

When r = c = 3, 13 shape parameters $\rho_i$, for i = 1, 2, 3,
4, 5, 17, 18, 23, 24, 32, 34, 39 and 40, do not exist. Out of the
remaining 27, only 11 are distinct. The $\rho_i$'s which merge together in
sets are
sets are tP^.Pg, Pjg.Psg' "so' ‘’35' ^36> » ' 

‘^ 12 '^ 19 '^ 20 * ^ ^^ 37 '‘^ 38 * * 

For r = 4, c = 5, 37 of these $\rho_i$'s are distinct. The
$\rho$ values having subscript numbers 17, 19 and 21 in Table 4.3.2
merge with that of another $\rho_i$ whose value is zero. For r = 5,
c = 6, all 40 values of the $\rho_i$'s are different.

For r = c, we have $r_2 = r_3$ and several $\rho_i$'s merge
with other values. In this case there are only 19 distinct
$\rho_i$ values. The $\rho_i$ values for the following sets of suffixes
as given in Table 4.3.2 are identical:

{2,3,6}, {4,5}, {7,8}, {10,11}, {13,14}, {15,16}, {17,18},
{19,20}, {21,22}, {23,24,28}, {25,26,35,36}, {29,30},
{31,32,33,34}, {37,38}, {39,40}.

These values along with $\rho_i$ for i = 1, 9, 12 and 27
constitute the 19 distinct values.


4.3.2. Shape parameters of the bivariate distribution of
$v_{i_1j_1,i_2j_2}$ and $v_{i_3j_3,i_4j_4}$

The shape parameter $\rho'$ of the distribution of any two
$v_{i_1j_1,i_2j_2}$ and $v_{i_3j_3,i_4j_4}$ is given by

(4.3.2)  $\rho'(v_{i_1j_1,i_2j_2},\ v_{i_3j_3,i_4j_4}) = \dfrac{R(i_1j_1, i_3j_3) - R(i_1j_1, i_4j_4) - R(i_2j_2, i_3j_3) + R(i_2j_2, i_4j_4)}{2\,[\{1 - R(i_1j_1, i_2j_2)\}\{1 - R(i_3j_3, i_4j_4)\}]^{1/2}},$

where $i_1, i_2, i_3, i_4 = 1, 2, ..., r$; $j_1, j_2, j_3, j_4 = 1, 2, ..., c$ and
$R(i_1j_1, i_2j_2)$ etc. are defined at equation (4.2.2).


For a two-way table, the total number of distinct $\rho'$'s
is now 43. We label them as $\rho'_i$ (i = 1, 2, ..., 43). As said before,
the total number of bivariate distributions and shape parameters
is $\binom{N}{2}$, where $N = \binom{rc}{2}$.

Different types of combinations of $v_{i_1j_1,i_2j_2}$ and
$v_{i_3j_3,i_4j_4}$, the corresponding expressions for the shape parameters
$\rho'$ in terms of $r_1$, $r_2$ and $r_3$, their frequencies of occurrence
for 3x3, 4x5 and 5x6 layouts and the numerical values of $\rho'$ are
given in Table 4.3.3. An asterisk sign in the frequency column
indicates that the same value has occurred in some preceding
row (shown in brackets) for some other combination and that its
frequency is merged with the earlier one. Here the
frequencies of different combinations are difficult to count
separately as was done for the 'u' case, because in this case the
order of each residual which constitutes $v_{i_1j_1,i_2j_2}$ and
$v_{i_3j_3,i_4j_4}$ matters in the evaluation of the $\rho'_i$'s. Note that for
application purposes, we only need the numerical values of the $\rho'_i$'s
along with their frequencies, and an abridged table, similar
to Table 3.2.4 of Chapter III, is sufficient.

When r = c = 3, only 13 of the $\rho'_i$'s are distinct. The
rest of them merge with one or the other. For the case of
r = 4, c = 5 only 39 out of 43 are distinct. Numerical values
of <^28 ^37 equal to p^^ , P^j^ # P 34 and P'^ 

respectively for this case. When r = c > 4, there are 22
distinct shape parameters. In this case also, many of the
$\rho'_i$'s merge together. The $\rho'_i$ values for the following sets
of suffixes as given in Table 4.3.3 are identical: {2,3},
{4,5}, {6,7}, {8,10}, {9,19,21,23}, {11,12,13}, {15,16},
{17,18}, {24,25,34,35}, {26,27}, {29,30,33}, {31,32}, {38,39},
{40,41}, {42,43}.

These values along with $\rho'_i$ for i = 1, 14, 20, 22, 28, 36 and 37
constitute the 22 distinct values.

4.4. Percentile points

In Section 2.3, we have obtained tables of nominal upper
percentile points for several values of n and k. These tables
can be used here for values of r and c satisfying 3 < r, c < 11
and r+c < 14. For some other values also we can use these
tables. For comparison purposes, we now obtain lower and upper
bounds for the exact percentage points $U_\alpha(e)$ and $V_\alpha(e)$ for some
selected values of $\alpha$ for a two-way classification with
(r,c) = (4,5) and (5,6). Percentile points by the Monte Carlo
procedure for some additional values of r and c have also been
obtained.

As discussed in Section 2.4, bounds for the exact percentage
point $U_\alpha(e)$ of U can be obtained by using the Bonferroni
inequalities. An upper limit for $U_\alpha(e)$ is $u_\alpha$, given in equation
(2.3.1), while a lower limit $u_{L\alpha}$ for $U_\alpha(e)$ is obtained from
equation (2.3.3). Due to the bivariate probabilities involved, a
direct solution of equation (2.3.3) is not possible. However,
an iterative procedure could be adopted for small values of r and
c. For the sake of notational simplicity, relabel all the
$u_{i_1j_1,i_2j_2}$ as $u_1, u_2, ..., u_N$, where $N = \binom{rc}{2}$. Then,

$U = \max_{1 \le i \le N} u_i$

and

(4.4.1)  $S_1(u) - S_2(u) \le \Pr(U > u) \le S_1(u),$

where

(4.4.2)  $S_1(u) = \sum_{i=1}^{N} \Pr(u_i > u)$

and

(4.4.3)  $S_2(u) = \sum\sum_{1 \le i < j \le N} \Pr(u_i > u,\ u_j > u).$

For finding a lower limit, we essentially have to solve
$S_1(u) - S_2(u) = \alpha$. For this, we start with the upper limit value
$u_\alpha$. Corresponding to this value, we first determine those
bivariate probabilities appearing in equation (4.4.3) which are
not equal to zero. If $\rho$ is the shape parameter between $u_i$ and
$u_j$, then from equation (2.4.11), we see that

$M(u_\alpha, u_\alpha, \rho, p) = \Pr(u_i > u_\alpha,\ u_j > u_\alpha) = 0$ if $\rho < 2u_\alpha^2 - 1 = \rho^*$ (say).

As an example, for $\alpha = 0.05$ and for a 4x5 table we get
$\rho^* = 0.3593$. Now referring to Table 4.3.2, we see that the bivariate
probabilities corresponding to $\rho_i$ for i = 9, 13, 14, 15 and 16
are non-zero. Thus five bivariate probabilities have to be
evaluated. However, the number of bivariate probabilities to
be evaluated increases rapidly as r and c increase. For example,
for $\alpha = 0.05$, and r = 5, c = 6 we have $\rho^* = 0.0382$ and the
number of bivariate probabilities to be evaluated increases
to 16.
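
The screening step described above is a simple comparison against $\rho^* = 2u_\alpha^2 - 1$. The sketch below illustrates it with hypothetical function names and made-up shape parameter values, which are not the entries of Table 4.3.2. It also recovers the bound 0.8660 used in Section 3.3 by inverting the same relation.

```python
import math


def rho_star(critical_value):
    """Threshold below which the bivariate probability vanishes."""
    return 2 * critical_value ** 2 - 1


def exactness_bound(max_rho):
    """Critical value above which every bivariate term with rho <= max_rho is zero."""
    return math.sqrt((1 + max_rho) / 2)


def terms_to_evaluate(shape_parameters, critical_value):
    """Shape parameters whose bivariate probability may be non-zero."""
    threshold = rho_star(critical_value)
    return [rho for rho in shape_parameters if rho >= threshold]


print(round(exactness_bound(0.5), 4))            # 0.866
# Illustrative values only; the actual shape parameters are those of Table 4.3.2.
hypothetical_rhos = [-0.25, 0.0, 0.3, 0.4, 0.55]
print(terms_to_evaluate(hypothetical_rhos, 0.8244))
```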

After evaluating all the non-zero bivariate probabilities
$M(u_\alpha, u_\alpha, \rho_i, p)$, we multiply each of them with their respective
frequency, which is again given in Table 4.3.2. The sum of these
would then give $S_2(u_\alpha)$ and we obtain a lower limit of the
significance probability, that is $\alpha - S_2(u_\alpha)$. It is then checked
whether this lower limit deviates from $\alpha$ by more than a small
preassigned tolerance. If yes, then let $\alpha_1 = \alpha + S_2(u_\alpha)$ and
determine the corresponding $u_{\alpha_1}$ value. Again the lower limit of
the significance probability, that is $\alpha_1 - S_2(u_{\alpha_1})$, is calculated
and compared with $\alpha$. This process is repeated until the lower
limit for $\Pr(U > u)$ does not differ from $\alpha$ by more than the
tolerance. The value of u which gives this final lower limit of
the significance probability can be considered as a lower bound
for the true $\alpha$th percentile point of U. We denote this value as
$u_{L\alpha}$. Note that $u_{L\alpha}$, subject to calculation approximations,
satisfies

$S_1(u_{L\alpha}) - S_2(u_{L\alpha}) = \alpha.$
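
The iteration described above can be sketched in a few lines. This is an illustration only: the function `lower_limit` and its arguments are hypothetical, the tolerance is an assumed value, and the two toy functions standing in for $S_1$ and $S_2$ are chosen purely so that the example runs. They are not the distributions of this study.

```python
import math


def lower_limit(alpha, s1_inverse, s2, tol=1e-6, max_iter=100):
    """Solve S1(u) - S2(u) = alpha by the successive correction described above."""
    level = alpha
    u = s1_inverse(level)                  # start from the nominal point: S1(u) = alpha
    for _ in range(max_iter):
        correction = s2(u)
        if abs((level - correction) - alpha) <= tol:   # lower limit S1(u) - S2(u)
            return u
        level = alpha + correction         # alpha_1 = alpha + S2(u_alpha), and so on
        u = s1_inverse(level)
    raise RuntimeError("iteration did not converge")


# Toy stand-ins (assumptions for illustration): S1(u) = exp(-u), S2(u) = exp(-2u)/2.
def toy_s1_inverse(level):
    return -math.log(level)


def toy_s2(u):
    return 0.5 * math.exp(-2 * u)


u_low = lower_limit(0.05, toy_s1_inverse, toy_s2)
print(u_low, -math.log(0.05))    # the lower limit lies below the nominal point
```

In an actual application `s1_inverse` would return the nominal percentage point for a given level and `s2` would sum the non-zero bivariate probabilities multiplied by their frequencies.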

Similarly , the bounds for V^( e) are also calculated. Here 
again we denote ^i2'’^ll,13' * * ‘^^rCc-l) v^/V^ , . , .,v^, 

where N == and 

V = Max I V . 1 • 
l<i<N ^ 

Now, we have 

S^(v) - S 2 (v) < Pr(V > v) < S^(v), 
where S^(v) = C^) PrClv^^l ^ 

S„(v) = 2 Pr(jv. I > V, |v, 1 > v) . 

l<i<j<N ^ 

Similar to the case of U, we again begin with the nominal
percentile value $v_\alpha$. Now from equation (2.4.13) we have

$$\Pr(|v_i| > v_\alpha,\ |v_j| > v_\alpha) = 0 \quad \text{if } |\rho'| < \rho'^{*} = 2v_\alpha^2 - 1,$$

where $\rho'$ is the shape parameter between $v_i$ and $v_j$. If $\rho'^{*}$
is negative, then of course we have to evaluate all the bivariate
probabilities. But if $\rho'^{*}$ is positive, then we have to
evaluate only those bivariate probability terms for which
$|\rho'| > \rho'^{*}$. Note that

$$\Pr(|v_i| > v_\alpha,\ |v_j| > v_\alpha) = 2\,[M(v_\alpha, v_\alpha, \rho', p) + M(v_\alpha, v_\alpha, -\rho', p)],$$

and consequently the value of this quantity is the same for $\rho'$ and $-\rho'$.
For $\alpha = 0.05$, r = 4 and c = 5, we have $\rho'^{*} = 0.4331$.
From Table 4.3.3, we see that the bivariate probabilities
which have to be evaluated correspond to $\rho'_i$ for
i = 1, 2, 4, 5, 8, 20, 22, 29, 36, 38 and 39. Out of these, $\rho'_1$, $\rho'_4$, $\rho'_5$, $\rho'_8$
and $\rho'_{20}$ are equal to $-\rho'_{36}$, $-\rho'_{38}$, $-\rho'_{39}$, $-\rho'_{29}$ and $-\rho'_{22}$
respectively. Thus essentially 6 bivariate probability terms
like $\Pr(|v_i| > v_\alpha, |v_j| > v_\alpha)$ have to be calculated. This
involves an evaluation of $M(v_\alpha, v_\alpha, \rho, p)$ probabilities for
12 different (both positive and negative) values of $\rho$. These
are evaluated and then multiplied by suitable frequencies
obtained from Table 4.3.3 and added to get $S_2(v_\alpha)$. The same
iterative procedure as for $u_{L\alpha}$ is then followed for obtaining
the bound $v_{L\alpha}$.
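The shape parameters used in these calculations can be reproduced numerically. The sketch below assumes the residual correlations of an additive r x c layout (same row: -1/(c-1), same column: -1/(r-1), otherwise 1/((r-1)(c-1))); this assumption is not stated in this section but it reproduces the tabulated values. The function names are illustrative.

```python
from math import sqrt


def cell_corr(a, b, r, c):
    """Assumed correlation between the residuals of cells a=(row, col) and b."""
    if a == b:
        return 1.0
    if a[0] == b[0]:
        return -1.0 / (c - 1)            # same row
    if a[1] == b[1]:
        return -1.0 / (r - 1)            # same column
    return 1.0 / ((r - 1) * (c - 1))     # different row and column


def shape_parameter(pair1, pair2, r, c, sign=+1):
    """Correlation of two pair statistics: sign=+1 for u-type, sign=-1 for v-type."""
    (a, b), (s, t) = pair1, pair2
    num = (cell_corr(a, s, r, c) + sign * cell_corr(a, t, r, c)
           + sign * cell_corr(b, s, r, c) + cell_corr(b, t, r, c))
    den = 2.0 * sqrt((1.0 + sign * cell_corr(a, b, r, c))
                     * (1.0 + sign * cell_corr(s, t, r, c)))
    return num / den


if __name__ == "__main__":
    # u_{11,22}, u_{11,33} in a 4x5 table (subscript 9 of Table 4.3.2)
    print(round(shape_parameter(((1, 1), (2, 2)), ((1, 1), (3, 3)), 4, 5), 4))
    # v_{11,12}, v_{11,21} and v_{11,12}, v_{12,22} (subscripts 1 and 36 of Table 4.3.3)
    print(round(shape_parameter(((1, 1), (1, 2)), ((1, 1), (2, 1)), 4, 5, sign=-1), 4))
    print(round(shape_parameter(((1, 1), (1, 2)), ((1, 2), (2, 2)), 4, 5, sign=-1), 4))
```

The last two values differ only in sign, which is the pairing exploited in the text to halve the number of distinct bivariate terms.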

The bounds for $u_\alpha(e)$ for (r,c) = (4,5) and (5,6) and $\alpha$ = 0.01,
0.05 and 0.10 are given in Table 4.4.1. Similarly the bounds
of $v_\alpha(e)$ for r = 4, c = 5 and for the same values of $\alpha$ are
tabulated in Table 4.4.2.

Some percentile points for the cases (r,c) = (3,3), (4,5),
(5,4), (5,6), (6,5), (6,10) and (10,6) are also obtained by the Monte
Carlo method. These are given in Table 4.4.3 and Table
4.4.5 for the statistics U and V respectively.

The method followed for obtaining simulated percentile
points was as follows. First rc standard normal variates for
an r x c table were generated. Then the residuals were obtained
in the usual manner and the $u_{i_1 j_1, i_2 j_2}$ values were calculated. Due to
cost considerations, only the $u_{i_1 j_1, i_2 j_2}$ values corresponding to the five
largest residuals were calculated. Their maximum then gives
the value of the statistic U. The process was repeated 1000 times
to get 1000 values of the statistic U. These 1000 values of U were
arranged in descending order of magnitude to obtain an estimate
of the upper 100$\alpha$ percent point.
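The simulation steps above can be sketched as follows. This is a minimal illustration, not the original program: it assumes the additive two-way layout (residuals from row and column means, common diagonal element (r-1)(c-1)/(rc) of the residual projector, and the cell correlations -1/(c-1), -1/(r-1), 1/((r-1)(c-1))), requires r and c to be at least 3, and uses one simple convention for reading off the upper percent point.

```python
import random
from itertools import combinations
from math import sqrt


def cell_corr(a, b, r, c):
    """Assumed residual correlation between two distinct cells a and b."""
    if a[0] == b[0]:
        return -1.0 / (c - 1)
    if a[1] == b[1]:
        return -1.0 / (r - 1)
    return 1.0 / ((r - 1) * (c - 1))


def simulate_U(r, c, rng, n_largest=5):
    """One simulated value of U, using only the n_largest residuals as in the text."""
    y = [[rng.gauss(0.0, 1.0) for _ in range(c)] for _ in range(r)]
    row_mean = [sum(row) / c for row in y]
    col_mean = [sum(y[i][j] for i in range(r)) / r for j in range(c)]
    grand = sum(row_mean) / r
    resid = {(i, j): y[i][j] - row_mean[i] - col_mean[j] + grand
             for i in range(r) for j in range(c)}
    s = sqrt(sum(e * e for e in resid.values()))      # S, root of residual sum of squares
    lam = (r - 1) * (c - 1) / (r * c)                 # common lambda_ii
    cells = sorted(resid, key=resid.get, reverse=True)[:n_largest]
    best = -1.0
    for a, b in combinations(cells, 2):
        rho = cell_corr(a, b, r, c)
        u = (resid[a] + resid[b]) / (sqrt(lam) * s * sqrt(2.0 * (1.0 + rho)))
        best = max(best, u)
    return best


def upper_percent_point(r, c, alpha, n_rep=1000, seed=0):
    """Estimate of the upper 100*alpha percent point from n_rep simulated values."""
    rng = random.Random(seed)
    values = sorted((simulate_U(r, c, rng) for _ in range(n_rep)), reverse=True)
    return values[int(alpha * n_rep)]


if __name__ == "__main__":
    print(upper_percent_point(4, 5, 0.05))
```

Averaging such estimates over independent repetitions, as described next, reduces the sampling variation of a single run.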

The entire process was repeated 25 times, and the average
of these 25 values was calculated for obtaining the simulated upper
100$\alpha$ percent point of U. Due to cost considerations, the average
of only 15 repetitions was taken for the (r,c) = (6,10) and (10,6)
cases. These values are given in Table 4.4.3 for U for
$\alpha$ = 0.005, 0.01, 0.025, 0.05 and 0.10. Due to symmetry, the
percentile points of the test statistic with r and c interchanged
should remain unchanged. Consequently, the mean of the
percentage points obtained for the (r,c) and (c,r) combinations is
also tabulated. This gives the desired percentage point by the
Monte Carlo method.

An exactly similar procedure was followed to obtain percentage
points of V, except that now the $v_{i_1 j_1, i_2 j_2}$ values were calculated
for all combinations of the 3 largest and 3 smallest residuals only.
Simulated percentage points for U, denoted by $u_\alpha(s)$, and for V,
denoted by $v_\alpha(s)$, are given in Tables 4.4.4 and 4.4.6 respectively.
The corresponding nominal percentage points obtained in Section 2.3
are also provided in parentheses. As can be seen from Tables
4.4.4 and 4.4.6, for both U and V, the nominal percentage points
provide a fairly good approximation to the true percentage points
even for moderately large values of (r,c) and $\alpha$. Thus for
r = 6, c = 10 and $\alpha$ = 0.10, the simulated percentage point for U
is $u_\alpha(s)$ = 0.5281 while the nominal percentage point is $u_\alpha$ = 0.5385.
Similarly for r = 6, c = 10 and $\alpha$ = 0.20 we have $v_\alpha(s)$ = 0.5257,
while the nominal percentage point is $v_\alpha$ = 0.5385.

Due to sampling variations, some of the simulated values are
outside the corresponding lower and upper bounds given in
Table 4.4.1 and Table 4.4.2, especially for small values of $\alpha$,
which indicates that a much larger number of repetitions has to
be performed to obtain satisfactory percentage points by the Monte Carlo
method. Even otherwise, simulated percentage points usually
have an accuracy of only 2 or 3 significant digits. We therefore
recommend the use of the nominal percentage points $u_\alpha$ and $v_\alpha$ for
U and V respectively, which are quite easy to evaluate.

TABLE 4.3.1. Sixty distinct matrices and their frequency
of occurrence for a 3x3 and a 4x5 table.

Decimal number        Frequency         Decimal number        Frequency
of the matrices   3x3 table 4x5 table   of the matrices   3x3 table 4x5 table

        0             -        120            530             3         40
        1             -         60            533             3         40
        2             -         60            544             3        120
        4             -         60            545             3         40
        8             -         60            550             3         40
       11             -         20            704             -        120
       12             -         20            708             3         60
       16             -         60            720             3         60
       18             -         20            736             3         60
       21             -         20           2048             -        240
       32             -         60           2049             3        120
       33             -         20           2050             3        120
       38             -         20           2052             3        120
       56             -         20           2056             3        120
       63             -          5           2059             3         40
       64             -        240           2060             3         40
       66             3        120           2064             3        120
       68             3        120           2066             3         40
       72             3        120           2069             3         40
       76             3         40           2112             -        180
       80             3        120           2114             6        120
       82             3         40           2116             3         60
       96             3        120           2120             3         60
      102             3         40           2128             6        120
      120             3         40           2130             9         60
      512             -        240           3584             -        120
      513             3        120           3585             3         60
      514             3        120           3586             3         60
      516             3        120           3588             3         60
      528             3        120           4032             -         20

Total: 3x3 table 126 = $\binom{9}{4}$; 4x5 table 4845 = $\binom{20}{4}$.

TABLE 4.3.2. Different combinations of $u_{i_1 j_1, i_2 j_2}$'s with shape
parameters and frequency of occurrence for 3x3, 4x5
and 5x6 tables (see note 1).

 i  Type of combination   Formula of rho_i                             3x3 table        4x5 table        5x6 table
                                                                      freq.   value    freq.   value    freq.    value

 1  u_{11,22}, u_{33,44}  2r_1/(1+r_1)                                    0       -      360   .1538     5400    .0952
 2  u_{11,12}, u_{23,24}  2r_1/(1+r_2)                                    0       -      180   .2222      900    .1250
 3  u_{11,21}, u_{32,42}  2r_1/(1+r_3)                                    0       -       60   .2500      450    .1333
 4  u_{11,22}, u_{33,34}  2r_1/[(1+r_1)(1+r_2)]^{1/2}                     0       -      720   .1849     5400    .1091
 5  u_{11,22}, u_{33,43}  2r_1/[(1+r_1)(1+r_3)]^{1/2}                     0       -      360   .1961     3600    .1127
 6  u_{11,12}, u_{23,33}  2r_1/[(1+r_2)(1+r_3)]^{1/2}                     9  1.0000      360   .2357     1800    .1291
 7  u_{11,21}, u_{11,31}  (1+3r_3)/[2(1+r_3)]                             9  -.5000       60   .0000      180    .1667
 8  u_{11,12}, u_{11,13}  (1+3r_2)/[2(1+r_2)]                             9  -.5000      120   .1667      300    .2500
 9  u_{11,22}, u_{11,33}  (1+3r_1)/[2(1+r_1)]                            18   .7000      720   .5769     3600    .5476
10  u_{11,12}, u_{11,22}  (1+r_1+r_2+r_3)/{2[(1+r_1)(1+r_2)]^{1/2}}      36   .1581      240   .2774      600    .3273
11  u_{11,21}, u_{11,22}  (1+r_1+r_2+r_3)/{2[(1+r_1)(1+r_3)]^{1/2}}      36   .1581      240   .2942      600    .3381
12  u_{11,12}, u_{11,21}  (1+r_1+r_2+r_3)/{2[(1+r_2)(1+r_3)]^{1/2}}      36   .2500      240   .3536      600    .3873
13  u_{11,22}, u_{11,32}  (1+2r_1+r_3)/[2(1+r_1)]                        18   .4000      240   .3846      900    .4048
14  u_{11,22}, u_{11,23}  (1+2r_1+r_2)/[2(1+r_1)]                        18   .4000      360   .4231     1200    .4286
15  u_{11,22}, u_{22,32}  (1+2r_1+r_3)/{2[(1+r_1)(1+r_3)]^{1/2}}         36   .6325      480   .4903     1800    .4789
16  u_{11,22}, u_{22,23}  (1+2r_1+r_2)/{2[(1+r_1)(1+r_2)]^{1/2}}         36   .6325      720   .5085     2400    .4910
17  u_{11,22}, u_{23,34}  (3r_1+r_2)/[2(1+r_1)]                           0       -     1440   .0000    10800   -.0238
18  u_{11,22}, u_{32,43}  (3r_1+r_3)/[2(1+r_1)]                           0       -      720  -.0385     7200   -.0476
19  u_{11,21}, u_{22,32}  (3r_1+r_2)/[2(1+r_3)]                          18   .2500      240   .0000      900   -.0333
20  u_{11,12}, u_{22,23}  (3r_1+r_3)/[2(1+r_2)]                          18   .2500      360  -.0556     1200   -.0625
21  u_{11,22}, u_{23,33}  (3r_1+r_2)/{2[(1+r_1)(1+r_3)]^{1/2}}           36   .1581     1440   .0000     7200   -.0282
22  u_{11,22}, u_{32,33}  (3r_1+r_3)/{2[(1+r_1)(1+r_2)]^{1/2}}           36   .1581     1440  -.0462     7200   -.0546
23  u_{11,22}, u_{13,24}  (r_1+r_2)/(1+r_1)                               0       -      360  -.1538     1800   -.1429
24  u_{11,22}, u_{32,41}  (r_1+r_3)/(1+r_1)                               0       -      120  -.2308      900   -.1905
25  u_{11,21}, u_{12,22}  (r_1+r_2)/(1+r_3)                               9  -.5000       60  -.2500      150   -.2000
26  u_{11,12}, u_{21,22}  (r_1+r_3)/(1+r_2)                               9  -.5000       60  -.3333      150   -.2500
27  u_{11,22}, u_{12,21}  (r_2+r_3)/(1+r_1)                               9  -.8000       60  -.5385      150   -.4286
28  u_{11,32}, u_{22,33}  (2r_1+r_2+r_3)/[2(1+r_1)]                      54  -.2000     2160  -.1923    10800   -.1667
29  u_{11,22}, u_{12,23}  (r_1+2r_2+r_3)/[2(1+r_1)]                      18  -.5000      360  -.3462     1200   -.2857
30  u_{11,32}, u_{12,21}  (r_1+r_2+2r_3)/[2(1+r_1)]                      18  -.5000      240  -.3846      900   -.3095
31  u_{11,23}, u_{12,22}  (r_1+r_2)/[(1+r_1)(1+r_3)]^{1/2}               18  -.3162      360  -.1961     1200   -.1690
32  u_{11,22}, u_{23,24}  (r_1+r_2)/[(1+r_1)(1+r_2)]^{1/2}                0       -      720  -.1849     3600   -.1637
33  u_{11,12}, u_{21,32}  (r_1+r_3)/[(1+r_1)(1+r_2)]^{1/2}               18  -.3162      240  -.2774      900   -.2182
34  u_{11,22}, u_{32,42}  (r_1+r_3)/[(1+r_1)(1+r_3)]^{1/2}                0       -      240  -.2942     1800   -.2254
35  u_{11,13}, u_{12,22}  (r_1+r_2)/[(1+r_2)(1+r_3)]^{1/2}               18  -.5000      360  -.2357     1200   -.1936
36  u_{11,31}, u_{21,22}  (r_1+r_3)/[(1+r_2)(1+r_3)]^{1/2}               18  -.5000      240  -.3536      900   -.2582
37  u_{11,23}, u_{12,13}  (r_1+2r_2+r_3)/{2[(1+r_1)(1+r_2)]^{1/2}}       36  -.7906      720  -.4160     2400   -.3273
38  u_{11,32}, u_{21,31}  (r_1+r_2+2r_3)/{2[(1+r_1)(1+r_3)]^{1/2}}       36  -.7906      480  -.4903     1800   -.3662
39  u_{11,12}, u_{13,14}  2r_2/(1+r_2)                                    0       -       60  -.6667      225   -.5000
40  u_{11,21}, u_{31,41}  2r_3/(1+r_3)                                    0       -       15 -1.0000       90   -.6667

    Total                                                               630            17955            94395

Note 1: For each table size the first entry is the frequency and the
second entry is the numerical value of the shape parameter.

TABLE 4.3.3. Different combinations of $v_{i_1 j_1, i_2 j_2}$'s with shape
parameters and frequency of occurrence for 3x3, 4x5
and 5x6 tables (see note 1).

 i  Type of combination   Formula of rho'_i                             3x3 table        4x5 table        5x6 table
                                                                       freq.   value    freq.   value    freq.    value

 1  v_{11,12}, v_{11,21}  (1-r_2-r_3+r_1)/{2[(1-r_2)(1-r_3)]^{1/2}}       18   .7500      120   .6455      300    .6124
 2  v_{11,21}, v_{11,22}  (1-r_1-r_3+r_2)/{2[(1-r_1)(1-r_3)]^{1/2}}      114   .3536      240   .4523      600    .4588
 3  v_{11,12}, v_{11,22}  (1-r_1-r_2+r_3)/{2[(1-r_1)(1-r_2)]^{1/2}}     *(2)   .3536      240   .3892      300    .4215
 4  v_{11,32}, v_{22,32}  (1-r_3)/{2[(1-r_1)(1-r_3)]^{1/2}}               51   .7071      320   .6030     1200    .5735
 5  v_{11,23}, v_{22,23}  (1-r_2)/{2[(1-r_1)(1-r_2)]^{1/2}}             *(4)   .7071      360   .5839     1200    .5620
 6  v_{11,22}, v_{11,23}  (1-2r_1+r_2)/[2(1-r_1)]                        126   .0000      360   .3182     1200    .3684
 7  v_{11,22}, v_{11,32}  (1-2r_1+r_3)/[2(1-r_1)]                       *(6)   .0000      320   .2727      600    .3421
 8  v_{11,22}, v_{32,41}  (r_1-r_3)/(1-r_1)                                0       -       60   .4545      450    .3158
 9  v_{11,22}, v_{12,21}  (r_2-r_3)/(1-r_1)                             *(6)   .0000       60   .0909      150    .0526
10  v_{11,22}, v_{23,31}  (2r_1-r_2-r_3)/[2(1-r_1)]                        6  1.0000      240   .4091     1200    .2895
11  v_{11,32}, v_{12,21}  (r_1-2r_3+r_2)/[2(1-r_1)]                       36   .5000     *(7)   .2727      600    .1842
12  v_{11,22}, v_{32,43}  (r_1-r_3)/[2(1-r_1)]                             0       -      360   .2273     3600    .1579
13  v_{11,22}, v_{23,34}  (r_1-r_2)/[2(1-r_1)]                             0       -      480   .1818     3600    .1316
14  v_{11,12}, v_{22,31}  (r_1-r_3)/[(1-r_1)(1-r_2)]^{1/2}              *(4)   .7071     *(3)   .3892      450    .2810
15  v_{11,32}, v_{12,13}  (r_1-r_3)/{2[(1-r_1)(1-r_2)]^{1/2}}           *(2)   .3536     1080   .1946     4800    .1405
16  v_{11,23}, v_{21,31}  (r_1-r_2)/{2[(1-r_1)(1-r_3)]^{1/2}}           *(2)   .3536      640   .1508     3000    .1147
17  v_{11,12}, v_{22,23}  (r_1-r_3)/[2(1-r_2)]                            12   .2500      120   .1667      400    .1250
18  v_{11,21}, v_{22,32}  (r_1-r_2)/[2(1-r_3)]                         *(17)   .2500       80   .1250      300    .1000
19  v_{11,32}, v_{22,33}  (r_2-r_3)/[2(1-r_1)]                          *(6)   .0000      960   .0455     4800    .0263
20  v_{11,21}, v_{11,31}  1/2                                          *(11)   .5000      600   .5000     2720    .5000
21  v_{11,22}, v_{33,44}  0                                                0       -     3675   .0000    25365    .0000
22  v_{11,12}, v_{12,13}  -1/2                                            54  -.5000      300  -.5000     1360   -.5000
23  v_{11,23}, v_{22,33}  -(r_2-r_3)/[2(1-r_1)]                         *(6)   .0000      480  -.0455     2400   -.0263
24  v_{11,22}, v_{31,43}  -(r_1-r_3)/[2(1-r_1)]                            0       -      360  -.2273     3600   -.1579
25  v_{11,22}, v_{13,34}  -(r_1-r_2)/[2(1-r_1)]                            0       -      960  -.1818     7200   -.1316
26  v_{11,12}, v_{21,23}  -(r_1-r_3)/[2(1-r_2)]                           24  -.2500      240  -.1667      800   -.1250
27  v_{11,21}, v_{13,33}  -(r_1-r_2)/[2(1-r_3)]                        *(26)  -.2500      160  -.1250      600   -.1000
28  v_{11,22}, v_{22,31}  -(1-2r_1+r_3)/[2(1-r_1)]                      *(6)   .0000      160  -.2727      300   -.3421
29  v_{11,22}, v_{31,42}  -(r_1-r_3)/(1-r_1)                               0       -       60  -.4545      450   -.3158
30  v_{11,22}, v_{13,24}  -(r_1-r_2)/(1-r_1)                               0       -      360  -.3636     1800   -.2632
31  v_{11,12}, v_{21,22}  -(r_1-r_3)/(1-r_2)                           *(22)  -.5000       60  -.3333      150   -.2500
32  v_{11,21}, v_{12,22}  -(r_1-r_2)/(1-r_3)                           *(22)  -.5000       60  -.2500      150   -.2000
33  v_{11,22}, v_{13,32}  -(2r_1-r_2-r_3)/[2(1-r_1)]                      12 -1.0000      480  -.4091     2400   -.2895
34  v_{11,22}, v_{21,32}  (2r_3-r_1-r_2)/[2(1-r_1)]                    *(22)  -.5000    *(28)  -.2727      300   -.1842
35  v_{11,22}, v_{12,23}  (2r_2-r_1-r_3)/[2(1-r_1)]                    *(22)  -.5000      360  -.1364     1200   -.1053
36  v_{11,12}, v_{12,22}  -(1-r_2-r_3+r_1)/{2[(1-r_2)(1-r_3)]^{1/2}}      18  -.7500      120  -.6455      300   -.6124
37  v_{11,12}, v_{12,21}  -(1-r_1-r_2+r_3)/{2[(1-r_1)(1-r_2)]^{1/2}}     102  -.3536      240  -.3892      300   -.4215
38  v_{11,22}, v_{22,32}  -(1-r_3)/{2[(1-r_1)(1-r_3)]^{1/2}}              57  -.7071      160  -.6030      600   -.5735
39  v_{11,22}, v_{22,23}  -(1-r_2)/{2[(1-r_1)(1-r_2)]^{1/2}}           *(38)  -.7071      360  -.5839     1200   -.5620
40  v_{11,23}, v_{31,33}  -(r_1-r_3)/[(1-r_1)(1-r_2)]^{1/2}            *(38)  -.7071    *(37)  -.3892      450   -.2810
41  v_{11,23}, v_{12,22}  -(r_1-r_2)/[(1-r_1)(1-r_3)]^{1/2}            *(38)  -.7071      360  -.3015     1200   -.2294
42  v_{11,23}, v_{12,13}  -(r_1-r_3)/{2[(1-r_1)(1-r_2)]^{1/2}}         *(37)  -.3536     1080  -.1946     4800   -.1405
43  v_{11,32}, v_{21,31}  -(r_1-r_2)/{2[(1-r_1)(1-r_3)]^{1/2}}         *(37)  -.3536     1280  -.1508     6000   -.1147

    Total                                                                630            17955            94395

Note 1: For each table size the first entry is the frequency and the
second entry is the numerical value of the shape parameter. A frequency
shown as an asterisk (*) indicates that the same value of $\rho'_i$ has
occurred in some preceding row (shown within brackets) for some other
combination, and its frequency is merged with the earlier frequency.

TABLE 4.4.1. Bounds for $u_\alpha(e)$ for (r,c) = (4,5) and (5,6).

              r = 4, c = 5                  r = 5, c = 6
 alpha   Lower bound  Upper bound      Lower bound  Upper bound

 0.01      0.8712       0.8712           0.7689       0.7692
 0.05      0.8242       0.8244           0.7181       0.7205
 0.10      0.7978       0.7989           0.6898       0.6959

TABLE 4.4.2. Bounds for $v_\alpha(e)$ for r = 4 and c = 5.

 alpha   Lower bound  Upper bound

 0.01      0.88710      0.88710
 0.05      0.84631      0.84647
 0.10      0.82362      0.82443

TABLE 4.4.3. $u_\alpha(s)$ values obtained by Monte Carlo method.

                              alpha
  r    c     0.005    0.01     0.025    0.05     0.10

  3    3    0.9952   0.9927   0.9869   0.9791   0.9668
  4    5    0.8859   0.8684   0.8449   0.8232   0.7965
  5    4    0.8878   0.8729   0.8462   0.8245   0.7989
  5    6    0.7878   0.7691   0.7405   0.7176   0.6889
  6    5    0.7864   0.7671   0.7396   0.7168   0.6904
  6   10    0.6095   0.5931   0.5692   0.5492   0.5273
 10    6    0.6102   0.5952   0.5704   0.5515   0.5290

TABLE 4.4.4. $u_\alpha(s)$ values obtained by taking the average of
(r,c) and (c,r) values of Table 4.4.3 (see note 1).

                              alpha
  r    c     0.005     0.01      0.025     0.05      0.10

  3    3    0.9952    0.9927    0.9869    0.9791    0.9668
           (0.9962)  (0.9940)  (0.9890)  (0.9825)  (0.9722)
  4    5    0.8869    0.8706    0.8456    0.8238    0.7977
           (0.8871)  (0.8712)  (0.8465)  (0.8244)  (0.7989)
  5    6    0.7871    0.7681    0.7400    0.7172    0.6896
           (0.7871)  (0.7692)  (0.7428)  (0.7205)  (0.6959)
  6   10    0.6098    0.5941    0.5698    0.5503    0.5281
           (0.6141)  (0.5982)  (0.5758)  (0.5578)  (0.5385)

Note 1: The values shown in parentheses are the nominal percentile
points obtained in Section 2.3.

TABLE 4.4.5. $v_\alpha(s)$ values obtained by Monte Carlo method.

                              alpha
  r    c     0.01     0.02     0.05     0.10     0.20

  3    3    0.9951   0.9924   0.9859   0.9767   0.9628
  4    5    0.8888   0.8722   0.8465   0.8229   0.7954
  5    4    0.8857   0.8691   0.8443   0.8219   0.7943
  5    6    0.7880   0.7674   0.7409   0.7156   0.6872
  6    5    0.7823   0.7656   0.7406   0.7170   0.6878
  6   10    0.6116   0.5959   0.5708   0.5507   0.5266
 10    6    0.6123   0.5953   0.5712   0.5496   0.5240

TABLE 4.4.6. $v_\alpha(s)$ values obtained by taking the average of
(r,c) and (c,r) values from Table 4.4.5 (see note 1).

                              alpha
  r    c     0.01      0.02      0.05      0.10      0.20

  3    3    0.9951    0.9924    0.9859    0.9767    0.9628
           (0.9962)  (0.9940)  (0.9890)  (0.9825)  (0.9722)
  4    5    0.8872    0.8707    0.8454    0.8224    0.7948
           (0.8871)  (0.8712)  (0.8465)  (0.8244)  (0.7989)
  5    6    0.7851    0.7665    0.7408    0.7163    0.6875
           (0.7871)  (0.7692)  (0.7428)  (0.7205)  (0.6959)
  6   10    0.6122    0.5956    0.5710    0.5501    0.5257
           (0.6141)  (0.5982)  (0.5758)  (0.5578)  (0.5385)

Note 1: The values shown in parentheses are the nominal percentile
points obtained in Section 2.3.


CHAPTER V


PERFORMANCE OF THE STATISTICS

5.1. Introduction

In this chapter we will study the performance of the test
statistics proposed in Section 2.1 in the non-null situation
when two outliers are present. Similar to Chapter IV, we
again consider the case $\nu = 0$ only. The distribution theory
results for $\nu > 0$ are exactly analogous, with minor changes
at a few places. Consequently, we shall use $S^2$ in place of $S_p^2$.
The statistic $u_{ij}$ of equation (2.1.1) then reduces to

$$u_{ij} = \left(e_i/\lambda_{ii}^{1/2} + e_j/\lambda_{jj}^{1/2}\right) \big/ \left[S\,\{2(1+\rho_{ij})\}^{1/2}\right]. \tag{5.1.1}$$

A similar expression holds for $v_{ij}$. Let $Y_1, Y_2, \ldots, Y_n$ be n
independent and normally distributed observations. We now
assume that exactly two of these observations are outliers.
The null hypothesis $H_0$ is that there is no outlier, and under $H_0$
the model is given by equation (1.2.1), viz. for $i = 1, 2, \ldots, n$,

$$E(Y_i) = \sum_{j=1}^{m} x_{ij}\beta_j = \mu_i \text{ (say)}, \qquad \mathrm{Var}(Y_i) = \sigma^2.$$

To evaluate the performance of the statistic U we use a
one-sided alternative hypothesis, and for V a two-sided alternative
hypothesis is used. For the one-sided hypothesis we assume that
two observations, we do not know which ones, have a mean shifted
to the right. The alternative hypothesis is then the union of
$\binom{n}{2}$ hypotheses $H_{ij}$ ($1 \le i < j \le n$), where under $H_{ij}$ we have

$$E(Y_s) = \begin{cases} \mu_s & \text{if } s \ne i, j, \\ \mu_i + \theta_i & \text{if } s = i, \\ \mu_j + \theta_j & \text{if } s = j, \end{cases}$$

where $\theta_i$ and $\theta_j$ are greater than zero. Other assumptions
regarding the variance and distribution of $Y_1, \ldots, Y_n$ remain
unchanged. Consequently, under $H_{ij}$,

$$\mathbf{Y} \sim N(E(\mathbf{Y}), \sigma^2 \mathbf{I}), \tag{5.1.2}$$

where

$$E(\mathbf{Y}) = \mathbf{X}\boldsymbol{\beta} + \mathbf{e}_i\theta_i + \mathbf{e}_j\theta_j, \tag{5.1.3}$$

and the vector $\mathbf{e}_i$ ($i = 1, 2, \ldots, n$) is the ith column of $\mathbf{I}$, the identity matrix
of order n.

Similarly, for the two-sided hypothesis, we assume that out
of two observations, one observation has a mean shifted to the
left while the other one has a mean shifted to the right. We
discuss the distribution theory of $u_{ij}$ and $v_{ij}$ etc. under the
alternative hypothesis in the next two sections. Measures of
performance of the test statistics U and V are studied in
Section 5.4. Finally, we compare our procedure with the
sequential procedure suggested by Anscombe (1960), Moran and
McMillan (1973), and John and Draper (1978), etc.


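5.2. Non-null distribution of $u_{ij}$

For notational convenience we shall denote the random
vector $\mathbf{Y}$ by $\mathbf{y}$ etc. From equation (1.2.2), the residual vector
is $\mathbf{e} = \mathbf{A}\mathbf{y}$. Further, as stated in Section 1.2, the residual
vector $\mathbf{e}$ has a singular normal distribution $N(\mathbf{0}, \mathbf{A}\sigma^2)$ under $H_0$.
We now obtain the distribution of $\mathbf{e}$ and of the residual sum of
squares $S^2$ under $H_{ij}$.

Clearly, under $H_{ij}$, $\mathbf{e}$ is normally distributed with
variance-covariance matrix $\mathbf{A}\sigma^2$ and mean

$$E(\mathbf{e} \mid H_{ij}) = \mathbf{A}\,E(\mathbf{y} \mid H_{ij}) = \mathbf{A}(\mathbf{X}\boldsymbol{\beta} + \mathbf{e}_i\theta_i + \mathbf{e}_j\theta_j) = \mathbf{A}(\mathbf{e}_i\theta_i + \mathbf{e}_j\theta_j),$$

since $\mathbf{A}\mathbf{X} = \mathbf{0}$.

The residual sum of squares $S^2$ has a non-central $\sigma^2\chi^2$
distribution with $n-k$ degrees of freedom and non-centrality
parameter $\lambda^*$, where

$$\begin{aligned}
\sigma^2\lambda^* &= E(\mathbf{y}' \mid H_{ij})\,\mathbf{A}\,E(\mathbf{y} \mid H_{ij}) \\
&= (\boldsymbol{\beta}'\mathbf{X}' + \mathbf{e}_i'\theta_i + \mathbf{e}_j'\theta_j)\,\mathbf{A}\,(\mathbf{X}\boldsymbol{\beta} + \mathbf{e}_i\theta_i + \mathbf{e}_j\theta_j) \\
&= \mathbf{e}_i'\mathbf{A}\mathbf{e}_i\,\theta_i^2 + \mathbf{e}_j'\mathbf{A}\mathbf{e}_j\,\theta_j^2 + 2\,\mathbf{e}_i'\mathbf{A}\mathbf{e}_j\,\theta_i\theta_j \\
&= \lambda_{ii}\theta_i^2 + \lambda_{jj}\theta_j^2 + 2\lambda_{ij}\theta_i\theta_j.
\end{aligned} \tag{5.2.1}$$

We denote this as $S^2 \overset{d}{=} \sigma^2\chi^2(n-k, \lambda^*)$.

For the sake of convenience we consider $H_{12}$ and derive
the distribution of $u_{ij}$ under $H_{12}$. For this, we need the
following properties of $\mathbf{A}$.

Property (i): $\mathbf{A}$ is a symmetric and idempotent matrix of
rank $(n-k)$.

Property (ii): Let $\boldsymbol{\lambda}_{(i)}$ denote the ith column of $\mathbf{A}$; then

$$\boldsymbol{\lambda}_{(i)}'\boldsymbol{\lambda}_{(j)} = \lambda_{ij}. \tag{5.2.2}$$

Proof: We have

$$\mathbf{A} = [\boldsymbol{\lambda}_{(1)}\ \boldsymbol{\lambda}_{(2)}\ \cdots\ \boldsymbol{\lambda}_{(n)}] = \begin{pmatrix} \boldsymbol{\lambda}_{(1)}' \\ \boldsymbol{\lambda}_{(2)}' \\ \vdots \\ \boldsymbol{\lambda}_{(n)}' \end{pmatrix}.$$

Since $\mathbf{A}$ is symmetric and idempotent, $\mathbf{A}\mathbf{A} = \mathbf{A}$,
and the result follows on equating the (i,j)th element on both
sides.

Property (iii): $\boldsymbol{\lambda}_{(i)}'\mathbf{X} = \mathbf{0}'$.

This is obvious, since $\mathbf{A}\mathbf{X} = \mathbf{0}$. The distribution of $u_{ij}$
under $H_{12}$ is given in Theorem 5.2.2. Its proof requires the
following lemma.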

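Lemma 5.2.1. Let

$$t_{ij} = \left(e_i/\lambda_{ii}^{1/2} + e_j/\lambda_{jj}^{1/2}\right) \big/ [2(1+\rho_{ij})]^{1/2}, \tag{5.2.3}$$

$Q_1 = t_{ij}^2$ and $Q_2 = S^2 - t_{ij}^2$. Then under $H_{12}$,

(i) $t_{ij}$ is distributed as $N(\mu, \sigma^2)$, where

$$\mu = \left[(\rho_{1i}+\rho_{1j})\,\lambda_{11}^{1/2}\,\theta_1 + (\rho_{2i}+\rho_{2j})\,\lambda_{22}^{1/2}\,\theta_2\right] \big/ [2(1+\rho_{ij})]^{1/2}. \tag{5.2.4}$$

This will be denoted by $t_{ij} \overset{d}{=} N(\mu, \sigma^2)$.

(ii) $Q_1 \overset{d}{=} \sigma^2\chi^2(1, \delta)$ and $Q_2 \overset{d}{=} \sigma^2\chi^2(n-k-1, \eta)$,
where $\chi^2(a, \lambda)$ denotes a non-central $\chi^2$ distribution with 'a'
degrees of freedom and non-centrality parameter $\lambda$, $\sigma^2\delta = \mu^2$ and

$$\begin{aligned}
\sigma^2\eta = \frac{1}{2(1+\rho_{ij})}\Big[ &\{2(1+\rho_{ij}) - (\rho_{1i}+\rho_{1j})^2\}\,\lambda_{11}\theta_1^2
+ \{2(1+\rho_{ij}) - (\rho_{2i}+\rho_{2j})^2\}\,\lambda_{22}\theta_2^2 \\
&+ 2\{2\rho_{12}(1+\rho_{ij}) - (\rho_{1i}+\rho_{1j})(\rho_{2i}+\rho_{2j})\}\,(\lambda_{11}\lambda_{22})^{1/2}\,\theta_1\theta_2 \Big].
\end{aligned} \tag{5.2.5}$$

(iii) $t_{ij}$ and $Q_2$ are independent.

Proof: Since $e_i = \boldsymbol{\lambda}_{(i)}'\mathbf{y}$, from equation (5.2.3) we have

$$t_{ij} = \frac{1}{[2(1+\rho_{ij})]^{1/2}}\left[\frac{\boldsymbol{\lambda}_{(i)}'}{\lambda_{ii}^{1/2}} + \frac{\boldsymbol{\lambda}_{(j)}'}{\lambda_{jj}^{1/2}}\right]\mathbf{y} = \mathbf{c}'\mathbf{y},$$

where

$$\mathbf{c}' = \frac{1}{[2(1+\rho_{ij})]^{1/2}}\left[\frac{\boldsymbol{\lambda}_{(i)}'}{\lambda_{ii}^{1/2}} + \frac{\boldsymbol{\lambda}_{(j)}'}{\lambda_{jj}^{1/2}}\right].$$

Note that

$$\mathbf{c}'\mathbf{c} = \frac{1}{2(1+\rho_{ij})}\left[\frac{\boldsymbol{\lambda}_{(i)}'\boldsymbol{\lambda}_{(i)}}{\lambda_{ii}} + \frac{\boldsymbol{\lambda}_{(j)}'\boldsymbol{\lambda}_{(j)}}{\lambda_{jj}} + \frac{2\,\boldsymbol{\lambda}_{(i)}'\boldsymbol{\lambda}_{(j)}}{(\lambda_{ii}\lambda_{jj})^{1/2}}\right] = \frac{2 + 2\rho_{ij}}{2(1+\rho_{ij})},$$

on using equation (5.2.2). Consequently,

$$\mathbf{c}'\mathbf{c} = 1. \tag{5.2.6}$$

Since $t_{ij}$ is a linear combination of $\mathbf{y}$, and under $H_{12}$ the vector $\mathbf{y}$
is normally distributed, under $H_{12}$ $t_{ij}$ is normally distributed with mean and
variance, respectively, given by

$$\begin{aligned}
\mu = E(t_{ij}) &= \mathbf{c}'E(\mathbf{y}) = \mathbf{c}'(\mathbf{X}\boldsymbol{\beta} + \mathbf{e}_1\theta_1 + \mathbf{e}_2\theta_2) = \mathbf{c}'(\mathbf{e}_1\theta_1 + \mathbf{e}_2\theta_2) \\
&= \frac{1}{[2(1+\rho_{ij})]^{1/2}}\left[\left(\frac{\lambda_{1i}}{\lambda_{ii}^{1/2}} + \frac{\lambda_{1j}}{\lambda_{jj}^{1/2}}\right)\theta_1 + \left(\frac{\lambda_{2i}}{\lambda_{ii}^{1/2}} + \frac{\lambda_{2j}}{\lambda_{jj}^{1/2}}\right)\theta_2\right],
\end{aligned}$$

which is the expression (5.2.4) because $\lambda_{1i} = \rho_{1i}(\lambda_{11}\lambda_{ii})^{1/2}$ etc., and

$$\mathrm{Var}(t_{ij}) = \mathbf{c}'\mathbf{c}\,\sigma^2 = \sigma^2,$$

on using equation (5.2.6).

Next, $Q_1 = t_{ij}^2 = \mathbf{y}'\mathbf{c}\,\mathbf{c}'\mathbf{y}$.
Hence $Q_1 \overset{d}{=} \sigma^2\chi^2(1, \delta)$, where the non-centrality parameter
is given by

$$\sigma^2\delta = [E(\mathbf{y})]'\,\mathbf{c}\,\mathbf{c}'\,[E(\mathbf{y})] = [E(t_{ij})]^2 = \mu^2.$$

Further,

$$Q_2 = S^2 - t_{ij}^2 = \mathbf{y}'\mathbf{A}\mathbf{y} - \mathbf{y}'\mathbf{c}\,\mathbf{c}'\mathbf{y} = \mathbf{y}'[\mathbf{A} - \mathbf{c}\mathbf{c}']\mathbf{y}.$$

Thus $Q_2$ is also a quadratic form in $\mathbf{y}$. The matrix of
the quadratic form is $\mathbf{A} - \mathbf{c}\mathbf{c}'$, which satisfies

$$\begin{aligned}
(\mathbf{A} - \mathbf{c}\mathbf{c}')^2 &= (\mathbf{A} - \mathbf{c}\mathbf{c}')(\mathbf{A} - \mathbf{c}\mathbf{c}') \\
&= \mathbf{A}^2 - \mathbf{A}\mathbf{c}\mathbf{c}' - \mathbf{c}\mathbf{c}'\mathbf{A} + \mathbf{c}\mathbf{c}'\mathbf{c}\mathbf{c}' \\
&= \mathbf{A} - \mathbf{A}\mathbf{c}\mathbf{c}' - \mathbf{c}\mathbf{c}'\mathbf{A} + \mathbf{c}\mathbf{c}',
\end{aligned}$$

on using equation (5.2.6). Further,

$$\mathbf{A}\mathbf{c} = \begin{pmatrix} \boldsymbol{\lambda}_{(1)}' \\ \boldsymbol{\lambda}_{(2)}' \\ \vdots \\ \boldsymbol{\lambda}_{(n)}' \end{pmatrix} \frac{1}{[2(1+\rho_{ij})]^{1/2}}\left[\frac{\boldsymbol{\lambda}_{(i)}}{\lambda_{ii}^{1/2}} + \frac{\boldsymbol{\lambda}_{(j)}}{\lambda_{jj}^{1/2}}\right] = \frac{1}{[2(1+\rho_{ij})]^{1/2}}\left[\frac{\boldsymbol{\lambda}_{(i)}}{\lambda_{ii}^{1/2}} + \frac{\boldsymbol{\lambda}_{(j)}}{\lambda_{jj}^{1/2}}\right] = \mathbf{c},$$

because, by equation (5.2.2), the sth element of $\mathbf{A}\boldsymbol{\lambda}_{(i)}$ is $\boldsymbol{\lambda}_{(s)}'\boldsymbol{\lambda}_{(i)} = \lambda_{si}$.
This implies that $\mathbf{c}'\mathbf{A} = \mathbf{c}'$ and $(\mathbf{A} - \mathbf{c}\mathbf{c}')^2 = \mathbf{A} - \mathbf{c}\mathbf{c}'$.
Thus $\mathbf{A} - \mathbf{c}\mathbf{c}'$ is an idempotent matrix of rank given by

$$\begin{aligned}
\mathrm{rank}(\mathbf{A} - \mathbf{c}\mathbf{c}') &= \mathrm{tr}(\mathbf{A} - \mathbf{c}\mathbf{c}') = \mathrm{tr}\,\mathbf{A} - \mathrm{tr}(\mathbf{c}\mathbf{c}') \\
&= \mathrm{rank}(\mathbf{A}) - \mathrm{tr}(\mathbf{c}'\mathbf{c}) = (n-k) - \mathbf{c}'\mathbf{c} = n-k-1.
\end{aligned}$$

Consequently,

$$Q_2 \overset{d}{=} \sigma^2\chi^2(n-k-1, \eta),$$

where

$$\sigma^2\eta = [E(\mathbf{y})]'[\mathbf{A} - \mathbf{c}\mathbf{c}'][E(\mathbf{y})] = E(\mathbf{y}')\,\mathbf{A}\,E(\mathbf{y}) - E(\mathbf{y}')\,\mathbf{c}\,\mathbf{c}'\,E(\mathbf{y}) = \sigma^2\lambda^* - \sigma^2\delta,$$

where $\lambda^*$ is given at equation (5.2.1) with i = 1 and j = 2, and
$\sigma^2\delta = \mu^2$. Hence

$$\begin{aligned}
\sigma^2\eta ={}& \lambda_{11}\theta_1^2 + \lambda_{22}\theta_2^2 + 2\lambda_{12}\theta_1\theta_2 \\
&- \frac{1}{2(1+\rho_{ij})}\Big[(\rho_{1i}+\rho_{1j})^2\lambda_{11}\theta_1^2 + (\rho_{2i}+\rho_{2j})^2\lambda_{22}\theta_2^2
+ 2(\rho_{1i}+\rho_{1j})(\rho_{2i}+\rho_{2j})(\lambda_{11}\lambda_{22})^{1/2}\theta_1\theta_2\Big] \\
={}& \frac{1}{2(1+\rho_{ij})}\Big[\{2(1+\rho_{ij}) - (\rho_{1i}+\rho_{1j})^2\}\lambda_{11}\theta_1^2
+ \{2(1+\rho_{ij}) - (\rho_{2i}+\rho_{2j})^2\}\lambda_{22}\theta_2^2 \\
&\qquad + 2\{2\rho_{12}(1+\rho_{ij}) - (\rho_{1i}+\rho_{1j})(\rho_{2i}+\rho_{2j})\}(\lambda_{11}\lambda_{22})^{1/2}\theta_1\theta_2\Big].
\end{aligned}$$

Finally, $t_{ij} = \mathbf{c}'\mathbf{y}$ and $Q_2 = \mathbf{y}'(\mathbf{A} - \mathbf{c}\mathbf{c}')\mathbf{y}$ are independent,
since

$$(\mathbf{A} - \mathbf{c}\mathbf{c}')\mathbf{c} = \mathbf{A}\mathbf{c} - \mathbf{c}\,\mathbf{c}'\mathbf{c} = \mathbf{c} - \mathbf{c} = \mathbf{0}.$$

This completes the proof of the lemma.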
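The identities used in the lemma can be checked numerically on a small example. The sketch below takes, purely as an assumed illustration, the residual projector of an additive r x c layout, places the two outliers at positions 0 and 1, and compares the direct matrix expressions with the closed forms (5.2.4) and (5.2.5); all function names are hypothetical.

```python
from math import sqrt


def residual_projector(r, c):
    """A = (I_r - J_r/r) kron (I_c - J_c/c), the assumed example design."""
    n = r * c
    a = [[0.0] * n for _ in range(n)]
    for i1 in range(r):
        for j1 in range(c):
            for i2 in range(r):
                for j2 in range(c):
                    a[i1 * c + j1][i2 * c + j2] = (
                        ((i1 == i2) - 1.0 / r) * ((j1 == j2) - 1.0 / c))
    return a


def lemma_quantities(a, i, j, theta1, theta2):
    """Return c, mu (direct and by (5.2.4)), sigma^2*eta (direct and by (5.2.5))."""
    n = len(a)

    def rho(s, t):
        return a[s][t] / sqrt(a[s][s] * a[t][t])

    d = 2.0 * (1.0 + rho(i, j))
    cvec = [(a[s][i] / sqrt(a[i][i]) + a[s][j] / sqrt(a[j][j])) / sqrt(d)
            for s in range(n)]
    mu_direct = cvec[0] * theta1 + cvec[1] * theta2      # c'(e_1 theta_1 + e_2 theta_2)
    p1 = rho(0, i) + rho(0, j)
    p2 = rho(1, i) + rho(1, j)
    mu_formula = (p1 * sqrt(a[0][0]) * theta1 + p2 * sqrt(a[1][1]) * theta2) / sqrt(d)
    lam_star = a[0][0] * theta1 ** 2 + a[1][1] * theta2 ** 2 + 2 * a[0][1] * theta1 * theta2
    eta_direct = lam_star - mu_direct ** 2               # sigma^2 (lambda* - delta)
    eta_formula = ((d - p1 ** 2) * a[0][0] * theta1 ** 2
                   + (d - p2 ** 2) * a[1][1] * theta2 ** 2
                   + 2 * (rho(0, 1) * d - p1 * p2)
                   * sqrt(a[0][0] * a[1][1]) * theta1 * theta2) / d
    return cvec, mu_direct, mu_formula, eta_direct, eta_formula


if __name__ == "__main__":
    A = residual_projector(4, 5)
    cvec, mu_d, mu_f, eta_d, eta_f = lemma_quantities(A, 0, 6, 1.5, 2.0)
    print(sum(x * x for x in cvec), mu_d, mu_f, eta_d, eta_f)
```

Agreement of the two values of mu and of the two values of sigma^2*eta is what the proof establishes algebraically.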

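We next obtain a general distribution from which the
distribution of $u_{ij}$ under $H_{12}$ can be obtained immediately.

Theorem 5.2.1. Let $T \overset{d}{=} N(\mu, 1)$ and $Q \overset{d}{=} \chi^2(a, \eta)$. If T and Q
are independently distributed, then

$$Z = T/(T^2 + Q)^{1/2} \tag{5.2.7}$$

has a pdf given by

$$f(z) = \sum_{j=0}^{\infty} K_j \sum_{i=0}^{\infty} K_i^{*}\, \frac{z^i (1-z^2)^{j+a/2-1}}{B[(i+1)/2,\ j+a/2]}, \qquad -1 < z < 1, \tag{5.2.8}$$

where

$$K_j = e^{-\eta/2}\,(\eta/2)^j / j!, \qquad j = 0, 1, 2, \ldots, \tag{5.2.9}$$

and

$$K_i^{*} = e^{-\mu^2/2}\,(\mu^2/2)^{i/2} / \Gamma(i/2+1), \qquad i = 0, 1, 2, \ldots. \tag{5.2.10}$$

Proof: Since $T \overset{d}{=} N(\mu, 1)$, $Q \overset{d}{=} \chi^2(a, \eta)$ and T and Q are independent,
the joint pdf of T and Q is

$$\begin{aligned}
f(t, Q) &= \frac{e^{-(t-\mu)^2/2}}{(2\pi)^{1/2}}\; e^{-\eta/2} \sum_{j=0}^{\infty} \frac{(\eta/2)^j\, e^{-Q/2}\, Q^{j+a/2-1}}{j!\; 2^{j+a/2}\, \Gamma(j+a/2)} \\
&= \sum_{j=0}^{\infty} K_j\, \frac{e^{-\mu^2/2}}{(2\pi)^{1/2}\, 2^{j+a/2}\, \Gamma(j+a/2)}\; e^{-(t^2 - 2\mu t + Q)/2}\, Q^{j+a/2-1}, \qquad 0 < Q < \infty,\ -\infty < t < \infty,
\end{aligned}$$

where $K_j$ is given at equation (5.2.9).
Now making the transformation

$$z = t/(t^2+Q)^{1/2}, \qquad Q = Q, \qquad -1 < z < 1,$$

the inverse transformation is

$$t = z\,Q^{1/2}/(1-z^2)^{1/2}, \qquad Q = Q.$$

The Jacobian of the transformation is given by

$$\frac{\partial(t, Q)}{\partial(z, Q)} = \begin{vmatrix} Q^{1/2}/(1-z^2)^{3/2} & z/[2\{Q(1-z^2)\}^{1/2}] \\ 0 & 1 \end{vmatrix} = Q^{1/2}/(1-z^2)^{3/2},$$

and the joint pdf of Z and Q is given by

$$\begin{aligned}
f(z, Q) &= \sum_{j=0}^{\infty} K_j\, \frac{e^{-\mu^2/2}}{(2\pi)^{1/2}\, 2^{j+a/2}\, \Gamma(j+a/2)}\; e^{-\{z^2 Q/(1-z^2) - 2\mu z Q^{1/2}/(1-z^2)^{1/2} + Q\}/2}\; Q^{j+(a-1)/2}\,(1-z^2)^{-3/2} \\
&= \sum_{j=0}^{\infty} K_j\, \frac{e^{-\mu^2/2}}{(2\pi)^{1/2}\, 2^{j+a/2}\, \Gamma(j+a/2)}\; e^{-\{Q/(1-z^2) - 2\mu z Q^{1/2}/(1-z^2)^{1/2}\}/2}\; Q^{j+(a-1)/2}\,(1-z^2)^{-3/2},
\end{aligned}$$

for $0 < Q < \infty$, $-1 < z < 1$.

Make the transformation $x = Q^{1/2}/(1-z^2)^{1/2}$, $z = z$; that is,
$Q = x^2(1-z^2)$, $z = z$.
The Jacobian of the transformation is $|\partial Q/\partial x| = 2x(1-z^2)$,
and the joint distribution of Z and X becomes

$$\begin{aligned}
f(z, x) &= \sum_{j=0}^{\infty} K_j\, \frac{e^{-\mu^2/2}}{(2\pi)^{1/2}\, 2^{j+a/2}\, \Gamma(j+a/2)}\; e^{-(x^2 - 2\mu z x)/2}\; x^{2j+a-1}\,(1-z^2)^{j+(a-1)/2}\,(1-z^2)^{-3/2}\; 2x(1-z^2) \\
&= \sum_{j=0}^{\infty} K_j\, \frac{e^{-\mu^2/2}\,(1-z^2)^{j+a/2-1}}{(2\pi)^{1/2}\, 2^{j+a/2-1}\, \Gamma(j+a/2)}\; x^{2j+a}\, e^{-(x^2 - 2\mu z x)/2}, \qquad -1 < z < 1,\ 0 < x < \infty.
\end{aligned}$$

Integrating x from 0 to $\infty$,

$$f(z) = \sum_{j=0}^{\infty} K_j\, \frac{e^{-\mu^2/2}\,(1-z^2)^{j+a/2-1}}{(2\pi)^{1/2}\, 2^{j+a/2-1}\, \Gamma(j+a/2)} \int_0^{\infty} x^{2j+a}\, e^{-x^2/2}\, e^{\mu z x}\, dx. \tag{5.2.12}$$

Now, consider the integral

$$I(z) = \int_0^{\infty} e^{-x^2/2}\, x^{2j+a}\, e^{\mu x z}\, dx$$

appearing in equation (5.2.12).
This integral is convergent. Expanding $\exp(\mu x z)$ in powers of
$\mu x z$, we get

$$\begin{aligned}
I(z) &= \int_0^{\infty} \sum_{i=0}^{\infty} \{(\mu z x)^i / i!\}\, x^{2j+a}\, e^{-x^2/2}\, dx \\
&= \sum_{i=0}^{\infty} (\mu^i z^i / i!) \int_0^{\infty} x^{2\{j+a/2+(i+1)/2\}-1}\, e^{-x^2/2}\, dx, \qquad -1 < z < 1.
\end{aligned}$$

Letting $y = x^2/2$, that is $x = (2y)^{1/2}$, we have $x\,dx = dy$, and

$$\begin{aligned}
I(z) &= 2^{j+a/2+(i+1)/2-1} \sum_{i=0}^{\infty} (\mu^i z^i / i!) \int_0^{\infty} y^{j+a/2+(i+1)/2-1}\, e^{-y}\, dy \\
&= \sum_{i=0}^{\infty} 2^{j+a/2+(i+1)/2-1}\, (\mu^i z^i / i!)\; \Gamma\{j+a/2+(i+1)/2\},
\end{aligned}$$

where the power of 2 depends on i and belongs inside the sum.
Substituting for I(z) in equation (5.2.12) we get

$$\begin{aligned}
f(z) &= \sum_{j=0}^{\infty} K_j\, \frac{e^{-\mu^2/2}\,(1-z^2)^{j+a/2-1}}{\pi^{1/2}\, 2^{j+a/2-1/2}\, \Gamma(j+a/2)} \sum_{i=0}^{\infty} 2^{j+a/2+i/2-1/2}\, (\mu^i z^i / i!)\; \Gamma\{j+a/2+(i+1)/2\} \\
&= \sum_{j=0}^{\infty} K_j \sum_{i=0}^{\infty} \frac{e^{-\mu^2/2}\, 2^{i/2}\, \Gamma\{j+a/2+(i+1)/2\}\, \Gamma\{(i+1)/2\}}{i!\; \Gamma(1/2)\, \Gamma(j+a/2)\, \Gamma\{(i+1)/2\}}\; \mu^i z^i (1-z^2)^{j+a/2-1} \\
&= \sum_{j=0}^{\infty} K_j \sum_{i=0}^{\infty} \frac{e^{-\mu^2/2}\, 2^{i/2}\, \Gamma\{(i+1)/2\}\, \mu^i}{i!\; \Gamma(1/2)}\; \frac{z^i (1-z^2)^{j+a/2-1}}{B[(i+1)/2,\ j+a/2]} \\
&= \sum_{j=0}^{\infty} K_j \sum_{i=0}^{\infty} K_i^{*}\, \frac{z^i (1-z^2)^{j+a/2-1}}{B[(i+1)/2,\ j+a/2]}, \qquad -1 < z < 1,
\end{aligned}$$

where

$$K_i^{*} = \frac{e^{-\mu^2/2}\, 2^{i/2}\, \Gamma\{(i+1)/2\}\, \mu^i}{i!\; \Gamma(1/2)}, \qquad i = 0, 1, 2, \ldots.$$

Next, writing $i! = \Gamma(i+1)$ and applying the duplication formula

$$\Gamma(2x) = (2\pi)^{-1/2}\, 2^{2x-1/2}\, \Gamma(x)\, \Gamma(x+1/2),$$

we get

$$i! = 2^{i}\, \Gamma\{(i+1)/2\}\, \Gamma(i/2+1) / \Gamma(1/2).$$

Consequently,

$$K_i^{*} = \frac{e^{-\mu^2/2}\, 2^{-i/2}\, \mu^i\, \Gamma\{(i+1)/2\}}{\Gamma(i/2+1)\, \Gamma\{(i+1)/2\}} = e^{-\mu^2/2}\,(\mu^2/2)^{i/2} / \Gamma(i/2+1).$$

This completes the proof of the theorem.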
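The series (5.2.8) can be evaluated by truncation. The sketch below is an illustration only: the truncation limits `j_max` and `i_max` are arbitrary choices, it is written for mu >= 0, and it assumes a >= 2 so that the power of (1 - z^2) is non-negative.

```python
from math import exp, lgamma, log


def log_beta(p, q):
    """log B(p, q) via log-gamma."""
    return lgamma(p) + lgamma(q) - lgamma(p + q)


def pdf_z(z, mu, a, eta, j_max=40, i_max=80):
    """Truncated double series (5.2.8) for the pdf of Z = T / sqrt(T^2 + Q), mu >= 0."""
    if eta > 0:
        k = [exp(-eta / 2 + j * log(eta / 2) - lgamma(j + 1)) for j in range(j_max + 1)]
    else:
        k = [1.0] + [0.0] * j_max                 # only j = 0 contributes
    if mu > 0:
        half = mu * mu / 2
        k_star = [exp(-half + (i / 2) * log(half) - lgamma(i / 2 + 1))
                  for i in range(i_max + 1)]
    else:
        k_star = [1.0] + [0.0] * i_max            # only i = 0 contributes
    total = 0.0
    for j, kj in enumerate(k):
        if kj < 1e-16:
            continue
        p = j + a / 2.0
        base = (1.0 - z * z) ** (p - 1.0)
        for i, ks in enumerate(k_star):
            if ks < 1e-16:
                continue
            total += kj * ks * z ** i * base / exp(log_beta((i + 1) / 2.0, p))
    return total


if __name__ == "__main__":
    print(pdf_z(0.3, 1.5, 11, 2.0))
```

For mu = 0 and eta = 0 the sketch reduces to the central form (1 - z^2)^{a/2-1} / B(1/2, a/2), and numerical integration over (-1, 1) gives a total close to one.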

Corollary 5.2.1. Let $Z_1 = Z^2$. Then the pdf of $Z_1$ is given by

$$f(z_1) = \sum_{j=0}^{\infty} K_j \sum_{i=0}^{\infty} K_{2i}^{*}\, \frac{z_1^{\,i+1/2-1}\,(1-z_1)^{j+a/2-1}}{B(i+1/2,\ j+a/2)}, \qquad 0 < z_1 < 1. \tag{5.2.13}$$

Proof: From Theorem 5.2.1, we have

$$f(z) = \sum_{j=0}^{\infty} K_j \sum_{i_1=0}^{\infty} K_{i_1}^{*}\, \frac{z^{i_1} (1-z^2)^{j+a/2-1}}{B[(i_1+1)/2,\ j+a/2]}.$$

Making the transformation $z_1 = z^2$, the inverse transformations
are

$$z = z_1^{1/2} \quad \text{and} \quad z = -z_1^{1/2}.$$

Hence the Jacobian of the transformation $|dz/dz_1|$ is $1/(2z_1^{1/2})$ in
both cases. Consequently, the terms with odd $i_1$ cancel and

$$\begin{aligned}
f(z_1) &= \left. f(z)/(2z_1^{1/2}) \right|_{z = z_1^{1/2}} + \left. f(z)/(2z_1^{1/2}) \right|_{z = -z_1^{1/2}} \\
&= 2 \sum_{j=0}^{\infty} K_j \sum_{i_1\ \text{even}} K_{i_1}^{*}\, (1/2)\, \frac{z_1^{(i_1-1)/2}\,(1-z_1)^{j+a/2-1}}{B[(i_1+1)/2,\ j+a/2]} \\
&= \sum_{j=0}^{\infty} K_j \sum_{i_1\ \text{even}} K_{i_1}^{*}\, \frac{z_1^{(i_1+1)/2-1}\,(1-z_1)^{j+a/2-1}}{B[(i_1+1)/2,\ j+a/2]}.
\end{aligned}$$

Writing $i_1 = 2i$, we simply get

$$f(z_1) = \sum_{j=0}^{\infty} K_j \sum_{i=0}^{\infty} K_{2i}^{*}\, \frac{z_1^{\,i+1/2-1}\,(1-z_1)^{j+a/2-1}}{B(i+1/2,\ j+a/2)}.$$

Note that

$$K_{2i}^{*} = e^{-\mu^2/2}\,(\mu^2/2)^i / \Gamma(i+1) = e^{-\mu^2/2}\,(\mu^2/2)^i / i!, \qquad i = 0, 1, 2, \ldots.$$

Thus the marginal distribution of $Z_1$ is a linear combination
of beta variables with weights given by Poisson probability terms.


Now, we apply the results of Theorem 5.2.1 to get the pdf 


of U.J. 


From equations(5 .1 .1) and (5.2.3), we have 


Ui- = t./S = t./(t^i. t Q 2 ) 


1/2 


t. ./a 

ij 

.2 ,,2 . V /^ 2 v 1/2 • 


By Lemma 5.2.1, we see that under ^2* ^ N(jU./a,l) and 

<3. 2 2 

□2 = d X (n-k-l , 77 ) , jLi and V are defined by eguations (5*2.4) 
and ( 5 . 2 . 5 ) respectively. 

Further t^j and Q 2 are independent. The conditions of 
Theorem 5.2.1 are thus satisfied. Hence we have the following 



theoran for M> o (with obvious changes for M < O) , 
T heorem 5«2«2 » The non-null distribution of u^j under H. 
given by 


is 


(5.2.14) 

oo oa ^ i j +3./ 2 ""I 

= S K Z (u..) ^(l“u|.) ^ ' /B[(x,+l)/2, jj+a/2], 

j^=0 h i^=0 ^1 J- -L 

-1 < u^^ < 1, 

where (for M > O) 

K. = e“^^ CV2)^Vj.! J j-j =0,1/2..... 

V — a/2 ^/2 

= e (6/2) /G(i 3_/2 +1); i^ = 0,1,2,..., 

2 2 

a = n-k-1, 6 =/-'- /o ; ii and V are as defined in 

equations (5.2.4) and (5.2.5) respectively, \ 

Note that 0^ ~ ®2 ~ ^ implies T) - p, = O and ~ ^ = 1# 
while other K. and are zero. Also a = n-k-1 = p-1, Pr®m 

•^1 H § * 

equation (5.2.14) we immediately get the pdf or u^^ under the 
null hypothesis as 

= (1-u1)'P'^'/Ve[1/2,(p-1)/2] , -1 < U.J <1, 

which is same as the null distribution obtained in equation 

( 2 . 2 . 11 ) . 


For studying the performance of our statistic, we need the probability $\Pr(u_{ij} > u_\alpha \mid H_{12})$. The probability can be obtained from Corollary 5.2.2.

Corollary 5.2.2. For $u_\alpha > 0$,

$$\Pr(u_{ij} > u_\alpha \mid H_{12}) = \frac{1}{2}\sum_{j=0}^{\infty} K_j \sum_{i=0}^{\infty} K'_i\, I_{1-u_\alpha^2}[\,j+a/2,\ (i+1)/2\,], \tag{5.2.15}$$

where $i$ and $j$ are used for $i_1$ and $j_1$ respectively and $u$ is used for the variable of integration for convenience.

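Proof: From equation (5.2.14),

$$\Pr(u_{ij} > u_\alpha \mid H_{12}) = \sum_{j=0}^{\infty} K_j \sum_{i=0}^{\infty} K'_i \int_{u_\alpha}^{1} \frac{u^{i}(1-u^2)^{\,j+a/2-1}}{B[(i+1)/2,\ j+a/2]}\, du.$$

Putting $y = u^2$, $dy = 2u\,du$, we get

$$\Pr(u_{ij} > u_\alpha \mid H_{12}) = \frac{1}{2}\sum_{j=0}^{\infty} K_j \sum_{i=0}^{\infty} K'_i\, \frac{1}{B[(i+1)/2,\ j+a/2]} \int_{u_\alpha^2}^{1} y^{(i+1)/2-1}(1-y)^{\,j+a/2-1}\, dy$$

$$= \frac{1}{2}\sum_{j=0}^{\infty} K_j \sum_{i=0}^{\infty} K'_i\, I_{1-u_\alpha^2}[\,j+a/2,\ (i+1)/2\,],$$

on substituting $z = 1-y$ in the integral. This proves the corollary.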

Note that

$$\Pr(u_{ij} > 0 \mid H_{12}) = \Pr(t_{ij} > 0 \mid H_{12}) = \Phi(\mu/\sigma).$$

Consequently, this probability is an increasing function of $\mu$. Thus for large values of $\mu/\sigma$, the distribution of $u_{ij}$ is essentially confined to the interval $[0,1]$; for example, for $\mu/\sigma = 3$, we have $\Pr(u_{ij} > 0 \mid H_{12}) = 0.99865$. This is useful for evaluating an approximate expression for $\Pr(u_{ij} > u_\alpha \mid H_{12})$ for large values of $\mu/\sigma$.
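The value quoted for $\mu/\sigma = 3$ can be checked numerically. The following is a minimal sketch (Python, standard library only; the function names are illustrative and do not come from the thesis). It evaluates $\Phi(3)$ directly, and it also sums the series of Corollary 5.2.2 at $u_\alpha = 0$, where every incomplete beta factor equals 1 and the weights $K_j$ sum to 1, so that the probability reduces to $\tfrac12\sum_i K'_i$ with $\delta = \mu^2/\sigma^2$.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_positive_from_series(delta, terms=400):
    """Half the sum of the weights K'_i = exp(-delta/2) (delta/2)^(i/2) / Gamma(i/2 + 1).

    Assumes delta > 0. Terms are computed on the log scale to avoid overflow.
    """
    x = delta / 2.0
    total = 0.0
    for i in range(terms):
        total += math.exp(-x + 0.5 * i * math.log(x) - math.lgamma(0.5 * i + 1.0))
    return 0.5 * total


if __name__ == "__main__":
    print(round(normal_cdf(3.0), 5))
    print(round(prob_positive_from_series(9.0), 5))
```

Both routes should agree, which gives a quick consistency check on the weights $K'_i$ before they are used with $u_\alpha > 0$.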


5.3. Non-null distribution of $v_{ij}$

The statistic V is useful when there are two outliers, one on either extreme. So we consider the alternative hypothesis model under this set up, with one observation having a mean greater than the one under $H_0$ and another observation having a mean less than the one specified under $H_0$. Thus under $H^*_{ij}$, $i \ne j$,

$$E(y) = X\beta - e_i\theta_i + e_j\theta_j,$$

where $e_i$ $(i = 1,2,\ldots,n)$ is the $i$th column of the identity matrix $I_n$ and $\theta_i, \theta_j > 0$.

In this case the alternative hypothesis is the union of $n(n-1)$ hypotheses $H^*_{ij}$ $(i, j = 1,2,\ldots,n;\ i \ne j)$. Under $H^*_{ij}$, $e$ is normally distributed with variance-covariance matrix $A\sigma^2$ and mean

$$E(e \mid H^*_{ij}) = A\,E(y \mid H^*_{ij}) = A(X\beta - e_i\theta_i + e_j\theta_j) = A e_j\theta_j - A e_i\theta_i,$$

since $AX = 0$.

The residual sum of squares $S$ has a non-central $\sigma^2\chi^2$ distribution with $n-k$ degrees of freedom and non-centrality parameter $\lambda^{**}$, where

$$\sigma^2\lambda^{**} = E(y' \mid H^*_{ij})\,A\,E(y \mid H^*_{ij}) = (\beta'X' - e_i'\theta_i + e_j'\theta_j)\,A\,(X\beta - e_i\theta_i + e_j\theta_j)$$

$$= \lambda_{ii}\theta_i^2 - 2\lambda_{ij}\theta_i\theta_j + \lambda_{jj}\theta_j^2. \tag{5.3.1}$$


Again for the sake of convenience we consider $H^*_{12}$ and derive the exact distribution of $v_{ij}$ under $H^*_{12}$. For this we use the properties of $A$ mentioned in Section 5.2. Its proof requires the following lemma.


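Lemma 5.3.1. Let

$$t^*_{ij} = \left(e_i/\lambda_{ii}^{1/2} - e_j/\lambda_{jj}^{1/2}\right)\big/[2(1-\rho_{ij})]^{1/2} \tag{5.3.2}$$

and $Q_2 = S - t^{*2}_{ij}$. Then under $H^*_{12}$,

(i) $t^*_{ij}$ is distributed as $N(\mu^*, \sigma^2)$, where

$$\mu^* = \left[(\rho_{1j}-\rho_{1i})\lambda_{11}^{1/2}\theta_1 - (\rho_{2j}-\rho_{2i})\lambda_{22}^{1/2}\theta_2\right]\big/[2(1-\rho_{ij})]^{1/2}; \tag{5.3.3}$$

(ii) $Q_1 = t^{*2}_{ij} \overset{d}{=} \sigma^2\chi^2(1, \delta^*)$ and $Q_2 \overset{d}{=} \sigma^2\chi^2(n-k-1, \eta^*)$, where $\chi^2(a, k)$ denotes a non-central $\chi^2$ distribution with $a$ degrees of freedom and noncentrality parameter $k$, $\delta^* = \mu^{*2}/\sigma^2$ and

$$\sigma^2\eta^* = \frac{1}{2(1-\rho_{ij})}\Big[\{2(1-\rho_{ij}) - (\rho_{1j}-\rho_{1i})^2\}\lambda_{11}\theta_1^2 + \{2(1-\rho_{ij}) - (\rho_{2j}-\rho_{2i})^2\}\lambda_{22}\theta_2^2$$
$$\qquad - 2\{2\lambda_{12}(1-\rho_{ij}) - (\rho_{1j}-\rho_{1i})(\rho_{2j}-\rho_{2i})(\lambda_{11}\lambda_{22})^{1/2}\}\theta_1\theta_2\Big]; \tag{5.3.4}$$

(iii) $t^*_{ij}$ and $Q_2$ are independent, and hence $Q_1$ and $Q_2$ are independent.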

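Proof: Since $e_i = \lambda_{(i)}' y$, where $\lambda_{(i)}$ denotes the $i$th column of $A$, hence from equation (5.3.2) we have

$$t^*_{ij} = \frac{1}{[2(1-\rho_{ij})]^{1/2}}\left[\frac{\lambda_{(i)}'}{\lambda_{ii}^{1/2}} - \frac{\lambda_{(j)}'}{\lambda_{jj}^{1/2}}\right] y = d'y,$$

where

$$d' = \frac{1}{[2(1-\rho_{ij})]^{1/2}}\left[\frac{\lambda_{(i)}'}{\lambda_{ii}^{1/2}} - \frac{\lambda_{(j)}'}{\lambda_{jj}^{1/2}}\right].$$

Note that

$$d'd = \frac{1}{2(1-\rho_{ij})}\left[\frac{\lambda_{(i)}'\lambda_{(i)}}{\lambda_{ii}} - \frac{2\lambda_{(i)}'\lambda_{(j)}}{(\lambda_{ii}\lambda_{jj})^{1/2}} + \frac{\lambda_{(j)}'\lambda_{(j)}}{\lambda_{jj}}\right] = \frac{2(1-\rho_{ij})}{2(1-\rho_{ij})},$$

on using equation (5.2.2). Consequently,

$$d'd = 1. \tag{5.3.5}$$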

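Since $t^*_{ij}$ is a linear combination of $y$ and under $H^*_{12}$,

$$y \overset{d}{=} N(X\beta - e_1\theta_1 + e_2\theta_2,\ \sigma^2 I),$$

hence $t^*_{ij}$ is normally distributed under $H^*_{12}$ with mean and variance given by

$$\mu^* = E(t^*_{ij}) = d'E(y) = d'(X\beta - e_1\theta_1 + e_2\theta_2) = d'(e_2\theta_2 - e_1\theta_1)$$

$$= \frac{1}{[2(1-\rho_{ij})]^{1/2}}\left[\left(\frac{\lambda_{2i}}{\lambda_{ii}^{1/2}} - \frac{\lambda_{2j}}{\lambda_{jj}^{1/2}}\right)\theta_2 - \left(\frac{\lambda_{1i}}{\lambda_{ii}^{1/2}} - \frac{\lambda_{1j}}{\lambda_{jj}^{1/2}}\right)\theta_1\right]$$

$$= \frac{1}{[2(1-\rho_{ij})]^{1/2}}\left[(\rho_{1j}-\rho_{1i})\lambda_{11}^{1/2}\theta_1 - (\rho_{2j}-\rho_{2i})\lambda_{22}^{1/2}\theta_2\right],$$

where $d'X = 0$ because $AX = 0$, and

$$\mathrm{Var}(t^*_{ij}) = \sigma^2 d'd = \sigma^2,$$

on using equation (5.3.5).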

Next, $Q_1 = t^{*2}_{ij} = y'dd'y$. Hence $Q_1 \overset{d}{=} \sigma^2\chi^2(1, \delta^*)$, where the non-centrality parameter is given by

$$\sigma^2\delta^* = [E(y)]'\,dd'\,[E(y)] = \mu^{*2}.$$

Further,

$$Q_2 = S - t^{*2}_{ij} = y'Ay - y'dd'y = y'[A - dd']y.$$

Thus $Q_2$ is also a quadratic form in $y$. The matrix of the quadratic form is $A - dd'$, which satisfies

$$(A - dd')^2 = A^2 - Add' + dd'dd' - dd'A = A - Add' + dd' - dd'A,$$

on using equation (5.3.5). Further, since $A$ is idempotent, $A\lambda_{(i)} = \lambda_{(i)}$, so that

$$Ad = \frac{1}{[2(1-\rho_{ij})]^{1/2}}\left[\frac{A\lambda_{(i)}}{\lambda_{ii}^{1/2}} - \frac{A\lambda_{(j)}}{\lambda_{jj}^{1/2}}\right] = \frac{1}{[2(1-\rho_{ij})]^{1/2}}\left[\frac{\lambda_{(i)}}{\lambda_{ii}^{1/2}} - \frac{\lambda_{(j)}}{\lambda_{jj}^{1/2}}\right] = d.$$

Consequently $(A - dd')^2 = A - dd'$, that is, $A - dd'$ is an idempotent matrix of rank given by

$$\mathrm{rank}(A - dd') = \mathrm{tr}(A - dd') = \mathrm{tr}\,A - \mathrm{tr}(dd') = \mathrm{rank}(A) - \mathrm{tr}(d'd) = n-k-d'd = n-k-1.$$


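Thus

$$Q_2 \overset{d}{=} \sigma^2\chi^2(n-k-1, \eta^*),$$

where

$$\sigma^2\eta^* = [E(y)]'\,[A - dd']\,[E(y)] = E(y')\,A\,E(y) - E(y')\,dd'\,E(y) = \sigma^2(\lambda^{**} - \delta^*),$$

where $\lambda^{**}$ is given in equation (5.3.1) with $i = 1$ and $j = 2$, and $\delta^* = \mu^{*2}/\sigma^2$. Hence

$$\sigma^2\eta^* = \lambda_{11}\theta_1^2 + \lambda_{22}\theta_2^2 - 2\lambda_{12}\theta_1\theta_2$$
$$\qquad - \frac{1}{2(1-\rho_{ij})}\Big[(\rho_{1j}-\rho_{1i})^2\lambda_{11}\theta_1^2 + (\rho_{2j}-\rho_{2i})^2\lambda_{22}\theta_2^2 - 2(\rho_{1j}-\rho_{1i})(\rho_{2j}-\rho_{2i})(\lambda_{11}\lambda_{22})^{1/2}\theta_1\theta_2\Big]$$

$$= \frac{1}{2(1-\rho_{ij})}\Big[\{2(1-\rho_{ij}) - (\rho_{1j}-\rho_{1i})^2\}\lambda_{11}\theta_1^2 + \{2(1-\rho_{ij}) - (\rho_{2j}-\rho_{2i})^2\}\lambda_{22}\theta_2^2$$
$$\qquad - 2\{2\lambda_{12}(1-\rho_{ij}) - (\rho_{1j}-\rho_{1i})(\rho_{2j}-\rho_{2i})(\lambda_{11}\lambda_{22})^{1/2}\}\theta_1\theta_2\Big].$$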


Finally, $t^*_{ij} = d'y$ and $Q_2 = y'(A - dd')y$ are independent, since

$$(A - dd')d = Ad - dd'd = d - d = 0.$$

Consequently $Q_1 = t^{*2}_{ij}$ and $Q_2$ are also independent.

This completes the proof of the lemma.

Now using these values of $\mu^*$, $\delta^* = \mu^{*2}/\sigma^2$ and $\eta^*$, and Theorem 5.2.1, we get the exact non-null distribution of $v_{ij}$ under $H^*_{12}$ as

$$f(v_{ij}) = \sum_{j_1=0}^{\infty} K_{j_1} \sum_{i_1=0}^{\infty} K^*_{i_1}\, \frac{(v_{ij})^{i_1}(1-v_{ij}^2)^{\,j_1+a/2-1}}{B[(i_1+1)/2,\ j_1+a/2]}, \qquad -1 < v_{ij} < 1, \tag{5.3.6}$$

where

$$K_{j_1} = e^{-\eta^*/2}(\eta^*/2)^{j_1}/j_1!, \qquad j_1 = 0,1,2,\ldots, \tag{5.3.7}$$

$$K^*_{i_1} = e^{-\delta^*/2}(\delta^*/2)^{i_1/2}/\Gamma(i_1/2+1), \qquad i_1 = 0,1,2,\ldots, \tag{5.3.8}$$

and $a = n-k-1$.


For studying the performance of the statistic V, we are interested in obtaining (for $v_\alpha > 0$)

$$\Pr(V > v_\alpha \mid H^*_{12}) = \Pr\Big(\max_{1 \le i < j \le n} |v_{ij}| > v_\alpha \,\Big|\, H^*_{12}\Big) \le \sum_{1 \le i < j \le n} \Pr(|v_{ij}| > v_\alpha \mid H^*_{12}) = \sum_{1 \le i < j \le n} \Pr(v_{ij}^2 > v_\alpha^2 \mid H^*_{12}).$$

Hence it is sufficient to study the distribution of $v_{ij}^2$ for obtaining a measure of performance of V. The following theorem gives an approximate distribution of $v_{ij}^2$. This is obtained by using Patnaik's (1949) approximation for the non-central $\chi^2$ distribution.


Theorem 5.3.1. Let $Z_1$ and $Z_2$ be independently distributed as $\chi^2(a_1, \lambda_1)$ and $\chi^2(a_2, \lambda_2)$ respectively. Then the pdf of

$$Z = Z_1/(Z_1 + Z_2) \tag{5.3.9}$$

can be approximated by

$$f(z) = \frac{c_1^{f_2/2}\, c_2^{f_1/2}\, z^{f_1/2-1}(1-z)^{f_2/2-1}}{B(f_1/2,\ f_2/2)\,[c_1 + (c_2-c_1)z]^{(f_1+f_2)/2}}, \qquad 0 \le z \le 1, \tag{5.3.10}$$

where for $i = 1,2$,

$$c_i = (a_i + 2\lambda_i)/(a_i + \lambda_i) \quad\text{and} \tag{5.3.11}$$

$$f_i = (a_i + \lambda_i)^2/(a_i + 2\lambda_i). \tag{5.3.12}$$


Proof: Since $Z_i \overset{d}{=} \chi^2(a_i, \lambda_i)$ $(i = 1,2)$, each $Z_i$ can be approximated by Patnaik's approximation (see, for example, Johnson and Kotz (1970, p. 198)) as $Z_i = c_i Y_i$, where $Y_i$ is distributed as a central $\chi^2$ with $f_i$ degrees of freedom, and $c_i$ and $f_i$ are defined in equations (5.3.11) and (5.3.12) respectively.

The pdf of $Y_i$ is given by

$$f(y_i) = e^{-y_i/2}\, y_i^{f_i/2-1}\big/[2^{f_i/2}\,\Gamma(f_i/2)], \qquad 0 < y_i < \infty,\ i = 1,2.$$

Hence the pdf of $Z_i$ is

$$f(z_i) = e^{-z_i/(2c_i)}\, z_i^{f_i/2-1}\big/[(2c_i)^{f_i/2}\,\Gamma(f_i/2)], \qquad 0 < z_i < \infty,\ i = 1,2.$$

Since $Z_1$ and $Z_2$ are independent, the joint pdf of $Z_1$ and $Z_2$ is given by

$$f(z_1, z_2) = \frac{e^{-\{z_1/(2c_1) + z_2/(2c_2)\}}\, z_1^{f_1/2-1}\, z_2^{f_2/2-1}}{2^{(f_1+f_2)/2}\, c_1^{f_1/2}\, c_2^{f_2/2}\, \Gamma(f_1/2)\, \Gamma(f_2/2)}.$$


Now making a transformation

$$z = z_1/(z_1+z_2) \quad\text{and}\quad x = z_1 + z_2, \qquad 0 \le z \le 1,\ 0 < x < \infty,$$

the inverse transformation is given by $z_1 = zx$, $z_2 = x(1-z)$. The Jacobian of the transformation is $\left|\partial(z_1, z_2)/\partial(z, x)\right| = x$.

Hence the joint pdf of $Z$ and $X$ is given by

$$f(z, x) = \frac{z^{f_1/2-1}(1-z)^{f_2/2-1}\, e^{-\{z/(2c_1) + (1-z)/(2c_2)\}x}\, x^{(f_1+f_2)/2-1}}{2^{(f_1+f_2)/2}\, c_1^{f_1/2}\, c_2^{f_2/2}\, \Gamma(f_1/2)\, \Gamma(f_2/2)}.$$

Integrating $x$ from $0$ to $\infty$,

$$f(z) = \frac{\Gamma[(f_1+f_2)/2]\, z^{f_1/2-1}(1-z)^{f_2/2-1}}{2^{(f_1+f_2)/2}\, \Gamma(f_1/2)\, \Gamma(f_2/2)\, c_1^{f_1/2}\, c_2^{f_2/2}\, [z/(2c_1) + (1-z)/(2c_2)]^{(f_1+f_2)/2}}$$

$$= \frac{c_1^{f_2/2}\, c_2^{f_1/2}\, z^{f_1/2-1}(1-z)^{f_2/2-1}}{B(f_1/2,\ f_2/2)\,[c_1 + (c_2-c_1)z]^{(f_1+f_2)/2}}, \qquad 0 \le z \le 1.$$

This completes the proof of the theorem.

Note that $f(z)$ satisfies the properties of a density function, that is, (i) $f(z) \ge 0$ and (ii) $\int_0^1 f(z)\,dz = 1$. This can be verified as follows.

Since $c_1, c_2 > 0$ and $0 \le z \le 1$, hence $f(z) \ge 0$.

Now, to check the second property, let

$$z_1 = c_2 z/\{c_1 + (c_2 - c_1)z\}.$$

Then

$$1 - z_1 = c_1(1-z)/[c_1 + (c_2-c_1)z] \quad\text{and}\quad \frac{dz_1}{dz} = c_1 c_2/[c_1 + (c_2-c_1)z]^2.$$

Hence

$$\int_0^1 f(z)\,dz = \frac{1}{B(f_1/2,\ f_2/2)} \int_0^1 \frac{(c_2 z)^{f_1/2-1}\,[c_1(1-z)]^{f_2/2-1}}{[c_1 + (c_2-c_1)z]^{f_1/2-1}\,[c_1 + (c_2-c_1)z]^{f_2/2-1}}\cdot\frac{c_1 c_2\, dz}{[c_1 + (c_2-c_1)z]^2}$$

$$= \frac{1}{B(f_1/2,\ f_2/2)} \int_0^1 z_1^{f_1/2-1}(1-z_1)^{f_2/2-1}\, dz_1 = 1,$$

that is, $f(z)$ is indeed a density function.
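The density property can also be checked numerically. The sketch below (Python, standard library only; all function names and the parameter values in the usage lines are illustrative assumptions, not taken from the thesis) computes the Patnaik constants of equations (5.3.11) and (5.3.12), evaluates the approximate density (5.3.10), and integrates it with a simple midpoint rule. The parameters are chosen so that $f_1/2 > 1$ and $f_2/2 > 1$, which keeps the integrand bounded at both end points. It also confirms that $c_i f_i = a_i + \lambda_i$, the mean of a non-central $\chi^2(a_i, \lambda_i)$ variable, which is the first-moment matching condition behind the approximation.

```python
import math


def patnaik_constants(a, lam):
    """Return (c, f) of equations (5.3.11) and (5.3.12)."""
    c = (a + 2.0 * lam) / (a + lam)
    f = (a + lam) ** 2 / (a + 2.0 * lam)
    return c, f


def approx_ratio_pdf(z, a1, lam1, a2, lam2):
    """Approximate pdf (5.3.10) of Z = Z1 / (Z1 + Z2), for 0 < z < 1."""
    c1, f1 = patnaik_constants(a1, lam1)
    c2, f2 = patnaik_constants(a2, lam2)
    log_beta = math.lgamma(f1 / 2) + math.lgamma(f2 / 2) - math.lgamma((f1 + f2) / 2)
    log_num = ((f2 / 2) * math.log(c1) + (f1 / 2) * math.log(c2)
               + (f1 / 2 - 1) * math.log(z) + (f2 / 2 - 1) * math.log(1 - z))
    log_den = log_beta + ((f1 + f2) / 2) * math.log(c1 + (c2 - c1) * z)
    return math.exp(log_num - log_den)


def midpoint_integral(func, lo, hi, steps=20000):
    """Midpoint-rule integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))


if __name__ == "__main__":
    # Illustrative (hypothetical) degrees of freedom and non-centralities.
    total = midpoint_integral(lambda z: approx_ratio_pdf(z, 4, 4.0, 6, 1.0), 0.0, 1.0)
    print(round(total, 6))
```

For the case $a_1 = 1$ used later in this section, $f_1/2$ can fall below 1 and the density is unbounded near $z = 0$, so a plain midpoint rule is no longer adequate there.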

The results of Theorem 5.3.1 can now be applied for obtaining an approximate pdf of $v_{ij}^2$.

Similar to the expression of $u_{ij}$ in equation (5.1.1), the statistic $v_{ij}$ of equation (2.1.1) also reduces to

$$v_{ij} = \left(e_i/\lambda_{ii}^{1/2} - e_j/\lambda_{jj}^{1/2}\right)\big/[S \cdot 2(1-\rho_{ij})]^{1/2} = t^*_{ij}/S^{1/2}, \tag{5.3.13}$$

on using equation (5.3.2). Hence

$$v_{ij}^2 = \frac{t^{*2}_{ij}}{S} = \frac{t^{*2}_{ij}}{t^{*2}_{ij} + Q_2} = \frac{Q_1/\sigma^2}{Q_1/\sigma^2 + Q_2/\sigma^2}.$$

By Lemma 5.3.1, we see that under $H^*_{12}$, $Q_1 \overset{d}{=} \sigma^2\chi^2(1, \delta^*)$ and $Q_2 \overset{d}{=} \sigma^2\chi^2(n-k-1, \eta^*)$, where $\delta^* = \mu^{*2}/\sigma^2$, and $\mu^*$ and $\eta^*$ are as defined by equations (5.3.3) and (5.3.4) respectively.

Further, $Q_1$ and $Q_2$ are independent. The conditions of Theorem 5.3.1 are satisfied if we take $a_1 = 1$, $a_2 = n-k-1$, $\lambda_1 = \delta^*$ and $\lambda_2 = \eta^*$. Hence we have the following theorem.

9 

Theorem 5.3.2. An approximate non-null pdf of $v^* = v_{ij}^2$ under $H^*_{12}$ is given by

$$f(v^*) = \frac{c_1^{f_2/2}\, c_2^{f_1/2}\, v^{*\,f_1/2-1}(1-v^*)^{f_2/2-1}}{B(f_1/2,\ f_2/2)\,[c_1 + (c_2-c_1)v^*]^{(f_1+f_2)/2}}, \qquad 0 \le v^* \le 1, \tag{5.3.14}$$

where $c_i = (a_i + 2\lambda_i)/(a_i + \lambda_i)$, $f_i = (a_i + \lambda_i)^2/(a_i + 2\lambda_i)$, $i = 1,2$; $a_1 = 1$, $a_2 = n-k-1$, $\lambda_1 = \delta^*$, $\lambda_2 = \eta^*$.

Note that $H_0$ implies that $\delta^* = \eta^* = 0$. Consequently $c_1 = c_2 = 1$ and $f_1 = a_1 = 1$, $f_2 = a_2 = n-k-1 = p-1$. This reduces equation (5.3.14) to

$$f(v^*) = \frac{v^{*\,-1/2}(1-v^*)^{(p-3)/2}}{B[1/2,\ (p-1)/2]}, \qquad 0 \le v^* \le 1,$$

which agrees with the exact null distribution of $v^* = v_{ij}^2$ obtained from equation (2.2.11). Thus in the null case, the approximate distribution given here coincides with the exact
distribution.

Corollary 5.3.1. The probability

$$P^{*\,i,j} = \Pr(|v_{ij}| > v_\alpha \mid H^*_{12}) \approx 1 - I_z(f_1/2,\ f_2/2), \tag{5.3.15}$$

where

$$z = c_2 v_\alpha^2/[c_1 + (c_2 - c_1)v_\alpha^2],$$

$c_i = (a_i + 2\lambda_i)/(a_i + \lambda_i)$, $f_i = (a_i + \lambda_i)^2/(a_i + 2\lambda_i)$, $i = 1,2$; $a_1 = 1$, $a_2 = n-k-1$, $\lambda_1 = \delta^*$ and $\lambda_2 = \eta^*$.

Proof: Using the approximate pdf (5.3.14),

$$P^{*\,i,j} = \Pr(|v_{ij}| > v_\alpha \mid H^*_{12}) = 1 - \Pr(v_{ij}^2 \le v_\alpha^2 \mid H^*_{12})$$

$$\approx 1 - \int_0^{v_\alpha^2} \frac{c_1^{f_2/2}\, c_2^{f_1/2}\, v^{*\,f_1/2-1}(1-v^*)^{f_2/2-1}}{B(f_1/2,\ f_2/2)\,[c_1 + (c_2-c_1)v^*]^{(f_1+f_2)/2}}\, dv^*.$$

Let $z_1 = c_2 v^*/[c_1 + (c_2 - c_1)v^*]$; then

$$1 - z_1 = c_1(1 - v^*)/[c_1 + (c_2 - c_1)v^*] \quad\text{and}\quad \frac{dz_1}{dv^*} = c_1 c_2/[c_1 + (c_2 - c_1)v^*]^2.$$

Hence

$$P^{*\,i,j} \approx 1 - \frac{1}{B(f_1/2,\ f_2/2)} \int_0^{z} z_1^{f_1/2-1}(1-z_1)^{f_2/2-1}\, dz_1 = 1 - I_z(f_1/2,\ f_2/2),$$

where $z = c_2 v_\alpha^2/[c_1 + (c_2 - c_1)v_\alpha^2]$.

Note that $\theta_1 = \theta_2 = 0$ implies $c_1 = c_2 = 1$, $f_1 = 1$, $f_2 = p-1$ and $z = v_\alpha^2$. Therefore, under $H_0$,

$$P^{*\,i,j} = \Pr(|v_{ij}| > v_\alpha \mid H_0) = 1 - I_{v_\alpha^2}[1/2,\ (p-1)/2] = I_{1-v_\alpha^2}[(p-1)/2,\ 1/2] = 2\alpha/[n(n-1)] = \alpha\Big/\binom{n}{2},$$

from equation (2.3.2), the value in the null case.
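The approximation of Corollary 5.3.1 is straightforward to evaluate. The following is a minimal sketch in Python using only the standard library; the function names are illustrative, and the regularized incomplete beta function is computed here by a simple midpoint rule after the substitution $s = t^{a}$ (an implementation choice made for this example, not the method of the thesis). The quadrature assumes the second beta parameter is at least 1, which holds whenever $n-k-1 \ge 2$ because $f_2 \ge a_2$. A library routine for the regularized incomplete beta function could be substituted for `reg_inc_beta`.

```python
import math


def patnaik_constants(a, lam):
    """Return (c, f) of equations (5.3.11) and (5.3.12)."""
    return (a + 2.0 * lam) / (a + lam), (a + lam) ** 2 / (a + 2.0 * lam)


def reg_inc_beta(x, a, b, steps=20000):
    """Regularized incomplete beta I_x(a, b), assuming a > 0 and b >= 1.

    With s = t**a the integral of t^(a-1) (1-t)^(b-1) over [0, x] becomes
    (1/a) times the integral of (1 - s^(1/a))^(b-1) over [0, x**a], which
    removes the end-point singularity at t = 0 when a < 1.
    """
    if x <= 0.0:
        return 0.0
    upper = min(x, 1.0) ** a
    h = upper / steps
    total = sum((1.0 - ((k + 0.5) * h) ** (1.0 / a)) ** (b - 1.0) for k in range(steps))
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return h * total / (a * math.exp(log_beta))


def approx_exceedance(v_alpha, n_minus_k, delta_star, eta_star):
    """Approximate Pr(|v_ij| > v_alpha) from equation (5.3.15)."""
    c1, f1 = patnaik_constants(1.0, delta_star)
    c2, f2 = patnaik_constants(n_minus_k - 1.0, eta_star)
    v2 = v_alpha ** 2
    z = c2 * v2 / (c1 + (c2 - c1) * v2)
    return 1.0 - reg_inc_beta(z, f1 / 2.0, f2 / 2.0)


if __name__ == "__main__":
    # Hypothetical inputs: n - k = 9 with illustrative non-centralities.
    print(approx_exceedance(0.8, 9, 16.0, 0.5))
```

In the null case ($\delta^* = \eta^* = 0$) the argument reduces to $z = v_\alpha^2$, as noted above; with $n-k-1 = 2$ the result is simply $1 - v_\alpha$, which gives a convenient hand check.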


The exact method which was discussed for the $u_{ij}$'s in Section 5.2 can also be applied for the $v_{ij}$'s. The distribution of $v_{ij}$ under $H^*_{12}$ is given at equation (5.3.6). Further,

$$\Pr(|v_{ij}| > v_\alpha \mid H^*_{12}) = \Pr(v_{ij}^2 > v_\alpha^2 \mid H^*_{12}).$$

Equation (5.2.13) now gives

$$P^{*\,i,j} = \sum_{j=0}^{\infty} K_j \sum_{i=0}^{\infty} K^*_{2i} \int_{v_\alpha^2}^{1} \frac{v^{\,i+1/2-1}(1-v)^{\,j+a/2-1}}{B(i+1/2,\ j+a/2)}\, dv$$

$$= \sum_{j=0}^{\infty} K_j \sum_{i=0}^{\infty} K^*_{2i}\, I_{1-v_\alpha^2}[\,j+a/2,\ i+1/2\,], \tag{5.3.16}$$

where $a = n-k-1$, and $K_j$ and $K^*_i$ are as defined in equations (5.3.7) and (5.3.8) respectively.

Table 5.3.1 gives the exact and approximate values of $P^{*\,1,2}$ for a random sample of size $n = 10$ from the $N(\mu, \sigma^2)$ distribution for different combinations of $\theta_1$ and $\theta_2$ and for $\alpha = 0.05$, where $\sigma = 1$ is taken without any loss of generality. The relative error of the approximate value with respect to the exact value of $P^{*\,1,2}$ is also tabulated.

This table shows that the approximate expression for $P^{*\,1,2}$ given in Corollary 5.3.1 is satisfactory. Similar results hold for the other $P^{*\,i,j}$ also. Further, the approximate method is economical both costwise and timewise. Consequently it is used for evaluation of these probabilities in later sections.

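Approximation for $\Pr(u_{ij} > u_\alpha \mid H_{12})$: We now obtain an approximate expression for $\Pr(u_{ij} > u_\alpha \mid H_{12})$. This is needed since, due to rounding errors, the exact method is difficult to deal with when $\theta_1$ and $\theta_2$ are large.

In equation (5.3.15), if we take $z = c_2 u_\alpha^2/[c_1 + (c_2 - c_1)u_\alpha^2]$, $c_i = (a_i + 2\lambda_i)/(a_i + \lambda_i)$, $f_i = (a_i + \lambda_i)^2/(a_i + 2\lambda_i)$ for $i = 1,2$, $a_1 = 1$, $a_2 = n-k-1$, $\lambda_1 = \delta = \mu^2/\sigma^2$ and $\lambda_2 = \eta$, where $\mu$ and $\eta$ are defined in equations (5.2.4) and (5.2.5), then we get an approximation of $\Pr(u_{ij}^2 > u_\alpha^2 \mid H_{12})$, that is, of $\Pr(|u_{ij}| > u_\alpha \mid H_{12})$. As shown in Section 5.2, the non-null distribution of $u_{ij}$ is essentially confined to the interval $[0,1]$ for large values of $\mu/\sigma$ and hence of $\theta_1$ and $\theta_2$. Consequently,

$$P^{i,j} = \Pr(u_{ij} > u_\alpha \mid H_{12}) \approx \Pr(u_{ij}^2 > u_\alpha^2 \mid H_{12}).$$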

The exact expression for $\Pr(u_{ij} > u_\alpha \mid H_{12})$ is given in Corollary 5.2.2. Table 5.3.2 compares the exact and approximate expressions for $P^{1,2}$ for a random sample of size $n = 10$ from the $N(\mu, \sigma^2)$ distribution for different combinations of $\theta_1$ and $\theta_2$ and for $\alpha = 0.05$, where we again take $\sigma = 1$. The relative error is also tabulated. It can be seen that the approximation is not good for small values of $\theta_1$ and $\theta_2$. In particular it is quite bad for $\theta_1 = \theta_2 = 0$. However, it is fairly accurate for large values of $\theta_1$ and $\theta_2$. Further, the relative error keeps on decreasing as $\theta_1$ and $\theta_2$ increase. Similar conclusions hold for the other $P^{i,j}$'s. Due to the excessive cost and time consumed in evaluating the exact value, the approximate method is used for large values of $\theta_1$ and $\theta_2$ in our performance studies of the statistic U.

5.4. Measures of performance

We now study the performance of the test statistics U and V under suitable alternative hypotheses. Theoretically, the best measure of performance is the power function of the test. However, this is extremely difficult to evaluate. We therefore concentrate on some simple and easy to calculate measures of performance. Our measures of performance are analogous to the measures proposed and studied by David and Paulson (1965), McMillan (1971) and Joshi (1972). We also assume that a priori every pair of observations has an equal chance of being an outlying pair.

5.4.1. Measures of performance of the statistic U

For this case, the alternative hypothesis is as specified in Section 5.2. For $1 \le i < j \le n$, let

$$P_{ij} = \Pr(u_{ij} > u_\alpha \mid H_{ij}), \quad\text{and} \tag{5.4.1}$$

$$Q_{ij} = \Pr(U > u_\alpha \mid H_{ij}). \tag{5.4.2}$$

Since $U = \max_{1 \le i < j \le n} u_{ij}$, we obviously have $P_{ij} \le Q_{ij}$.

Both $P_{ij}$ and $Q_{ij}$ are reasonable measures of performance. However, one may go a step further and consider

$$P_a = \min_{1 \le i < j \le n} P_{ij}, \quad\text{and} \tag{5.4.3}$$

$$Q_a = \min_{1 \le i < j \le n} Q_{ij}. \tag{5.4.4}$$

If $P_{ij}$ and $Q_{ij}$ do not depend on $i$ and $j$, then $P_a = P_{12}$ is the probability that $u_{12}$ is significantly large when the alternative hypothesis is true. Similarly $Q_a = Q_{12}$ is the power function.

Numerical values for $P_{ij}$ can be calculated exactly from equation (5.2.15). They can also be approximated by the technique discussed at the end of Section 5.3. However, exact evaluation of $Q_{ij}$ is too complicated. But the first Bonferroni inequality can be used to get an upper bound for $Q_{ij}$. Thus

$$Q_{ij} = \Pr(U > u_\alpha \mid H_{ij}) = \Pr\Big(\max_{1 \le i_1 < j_1 \le n} u_{i_1 j_1} > u_\alpha \,\Big|\, H_{ij}\Big) \le \sum_{1 \le i_1 < j_1 \le n} \Pr(u_{i_1 j_1} > u_\alpha \mid H_{ij}), \tag{5.4.5}$$

where each term of the sum can be obtained from the non-null distribution obtained in Theorem 5.2.2. Obviously an upper limit for $Q_{ij}$ is 1. Hence

$$Q_{ij} \le \min\Big[1,\ \sum_{1 \le i_1 < j_1 \le n} \Pr(u_{i_1 j_1} > u_\alpha \mid H_{ij})\Big].$$

5.4.2. Measures of performance of the statistic V

Measures of performance for V can be defined in a manner similar to that of U. Now the alternative hypothesis is as described in Section 5.3. For $i \ne j = 1,2,\ldots,n$, let

$$P^*_{ij} = \Pr(|v_{ij}| > v_\alpha \mid H^*_{ij}), \tag{5.4.6}$$

$$Q^*_{ij} = \Pr(V > v_\alpha \mid H^*_{ij}). \tag{5.4.7}$$

We also consider the measures

$$P^*_a = \min P^*_{ij}, \quad\text{and} \tag{5.4.8}$$

$$Q^*_a = \min Q^*_{ij}. \tag{5.4.9}$$

Again if $P^*_{ij}$ and $Q^*_{ij}$ do not depend on $i$ and $j$, then $P^*_a$ and $Q^*_a$ have similar interpretations as for $P_a$ and $Q_a$ defined for the statistic U. As before, we have $P^*_{ij} \le Q^*_{ij}$, and

$$Q^*_{ij} = \Pr\Big(\max_{1 \le i_1 < j_1 \le n} |v_{i_1 j_1}| > v_\alpha \,\Big|\, H^*_{ij}\Big) \le \sum_{1 \le i_1 < j_1 \le n} \Pr(|v_{i_1 j_1}| > v_\alpha \mid H^*_{ij}). \tag{5.4.10}$$

The non-null probabilities appearing in equations (5.4.6) and (5.4.10) can be obtained exactly by using equation (5.3.16).

5.4.3. Application to a random sample from $N(\mu, \sigma^2)$ distribution

We now apply our measures of performance to the case of a random sample of size $n$ from the $N(\mu, \sigma^2)$ distribution. Since all measures under $H_{ij}$ depend on the ratios $\theta_i/\sigma$ and $\theta_j/\sigma$, we take $\sigma = 1$ without loss of generality.

For the statistic U, now $P_{ij}$ and $Q_{ij}$ do not depend on $i$ and $j$. Consequently $Q_a = Q_{12}$, and it is possible to evaluate lower and upper bounds for the power function. A lower bound is of course $P_{12}$, while an upper bound is given by

$$Q_{12} \le \min(1,\ \bar{Q}_{12}), \tag{5.4.11}$$

where from equation (5.4.5), we have

$$\bar{Q}_{12} = \Pr(u_{12} > u_\alpha \mid H_{12}) + (n-2)\Pr(u_{13} > u_\alpha \mid H_{12}) + (n-2)\Pr(u_{23} > u_\alpha \mid H_{12}) + \binom{n-2}{2}\Pr(u_{34} > u_\alpha \mid H_{12}).$$

For notational convenience, denote $\Pr(u_{ij} > u_\alpha \mid H_{12})$ by $P^{i,j}$, where the dependence on $H_{12}$ is suppressed. In this notation $P^{1,2} = P_{12}$ of equation (5.4.1). Consequently

$$\bar{Q}_{12} = P^{1,2} + (n-2)(P^{1,3} + P^{2,3}) + \binom{n-2}{2} P^{3,4}. \tag{5.4.12}$$

The bound (5.4.11) is useful when $\bar{Q}_{12} < 1$, which is the case for small values of $n$ and $\alpha$.
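The bound of equations (5.4.11) and (5.4.12) is simple to compute once the four pairwise probabilities are available. A minimal sketch follows (Python, standard library only; the function name and the numerical inputs in the usage lines are hypothetical and are not values from the tables of the thesis). The weights $1$, $2(n-2)$ and $\binom{n-2}{2}$ count the pairs containing both, exactly one, or neither of the two shifted observations, and together they account for all $\binom{n}{2}$ pairs.

```python
from math import comb


def bonferroni_upper_bound(p12, p13, p23, p34, n):
    """Upper bound min(1, Q-bar) of equations (5.4.11) and (5.4.12).

    p12, p13, p23, p34 are the pairwise exceedance probabilities
    P^{1,2}, P^{1,3}, P^{2,3}, P^{3,4} for a random sample of size n.
    """
    q_bar = p12 + (n - 2) * (p13 + p23) + comb(n - 2, 2) * p34
    return min(1.0, q_bar)


if __name__ == "__main__":
    # Hypothetical probabilities, used only to illustrate the arithmetic.
    print(bonferroni_upper_bound(0.5, 0.01, 0.01, 0.001, n=10))
```

The same function applies to the statistic V if the four inputs are replaced by the corresponding probabilities for $|v_{ij}|$.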

As shown in Section 3.1, the test for two outliers based on U is equivalent to Murphy's test. For Murphy's test, the measure $P_{12}$ has been studied by McMillan (1971) and Moran and McMillan (1973), while an approximate method for evaluating the power $Q_{12}$ is given by Hawkins (1978). Hawkins has tabulated approximate power for $n = 10$ and $\alpha = 0.05$. In Table 5.4.1, we tabulate the $P^{i,j}$ and $\min(1, \bar{Q}_{12})$ for $n = 10$ and $\alpha = 0.05$, evaluated by the exact formula for $\theta_1, \theta_2 = 0, 1, \ldots, 5$.

For ^ 1*^2 — lower and upper limits for power function Qj ^2 
are given in Table 5.4.2 by the approximate method. As pointed out in Section 5.3, the approximation is not good for small values of $\theta_1$ and $\theta_2$. For the sake of comparison, we have also evaluated these probabilities by using Monte Carlo techniques. The method followed is as follows:

A sample of size 10 from a standard normal population is generated. Then the values $\theta_1$ and $\theta_2$ are added to the first and the second observation of the sample. The values $u_{ij}$ are calculated and compared with the nominal percentile point $u_\alpha$ obtained from Table 2.3.1. This procedure is repeated $N = 10{,}000$ times and the total number of times each of these $u_{ij}$'s exceeds the $u_\alpha$ value is counted. The number of times the maximum of these $u_{ij}$'s exceeds the $u_\alpha$ value is also obtained. Since

$$P^{1,2} = \Pr(u_{12} > u_\alpha \mid H_{12}),$$

hence $P^{1,2}$ is calculated by

$$\hat{P}^{1,2} = \frac{\text{Number of times } (u_{12} > u_\alpha) \text{ in } N \text{ repetitions}}{N}.$$

Similarly

$$\hat{Q}_{12} = \frac{\text{Number of times } (U > u_\alpha) \text{ in } N \text{ repetitions}}{N}.$$

However, $P^{1,3} = P^{1,4} = \ldots = P^{1,10}$; hence we evaluate $P^{1,3}$ by counting the total number of times $(u_{1j} > u_\alpha)$ for $j = 3, 4, \ldots, 10$ in $N$ repetitions and dividing this number by $8N$. Similarly, using the facts that the $P^{2,j}$ for $j = 3, 4, \ldots, 10$ and the $P^{i,j}$ for $3 \le i < j \le 10$ are all equal, we can evaluate $P^{2,3}$ and $P^{3,4}$. This is repeated for different values of $\theta_1$ and $\theta_2$. These values are tabulated in Table 5.4.3.
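The simulation just described can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the program used for Table 5.4.3. It assumes the random-sample case with no independent variance estimate, for which $e_i = y_i - \bar{y}$, $\lambda_{ii} = (n-1)/n$ and $\rho_{ij} = -1/(n-1)$, so that $u_{ij} = (e_i + e_j)/[2(n-2)S/n]^{1/2}$ with $S = \sum e_i^2$. The critical value `u_alpha` must be supplied by the user (for example from Table 2.3.1, which is not reproduced here); the value used in the usage lines is a placeholder, and the function names are illustrative.

```python
import math
import random
from itertools import combinations


def u_statistics(y):
    """All pairwise statistics u_ij for a random sample y (0-based indices)."""
    n = len(y)
    ybar = sum(y) / n
    e = [value - ybar for value in y]          # residuals
    s = sum(r * r for r in e)                  # residual sum of squares S
    scale = math.sqrt(2.0 * (n - 2) * s / n)   # [2 (1 + rho) lambda S]^(1/2)
    return {(i, j): (e[i] + e[j]) / scale for i, j in combinations(range(n), 2)}


def simulate(theta1, theta2, u_alpha, n=10, reps=10000, seed=0):
    """Monte Carlo estimates of P^{1,2} and of the power Q_12."""
    rng = random.Random(seed)
    hits_12 = 0
    hits_max = 0
    for _ in range(reps):
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y[0] += theta1                         # shift the first observation
        y[1] += theta2                         # shift the second observation
        u = u_statistics(y)
        hits_12 += u[(0, 1)] > u_alpha
        hits_max += max(u.values()) > u_alpha
    return hits_12 / reps, hits_max / reps


if __name__ == "__main__":
    # u_alpha = 0.9 is a placeholder, not a tabulated percentile point.
    print(simulate(4.0, 4.0, u_alpha=0.9, reps=2000))
```

The estimates of $P^{1,3}$, $P^{2,3}$ and $P^{3,4}$ can be obtained from the same dictionary of $u_{ij}$ values by pooling the counts over the equivalent pairs, as described above.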

A comparison of Tables 5.4,1 and 5.4,3 reveals that the 

upper bound for power given in equations (5.4.11) and (5.4.12) 

is quite close to the simulated value for fact# 

1 2 

P ' itself is quite close to the power value for ~ 

Consequently, one can evaluate the power function approximately by finding $P^{1,2}$ alone or by finding $\bar{Q}_{12}$, for which some extra computations are required. We here point out that the approximate power values of Murphy's test given by Hawkins (1978) for these values of $n$ and $\alpha$ are much less than our values. In fact, our lower bound itself is considerably larger than the power values tabulated by him. For example, our values for $P^{1,2}$ and his values within brackets for $\theta_1 = \theta_2 = \theta$ are 0.704 (0.594) and 0.911 (0.844) for $\theta = 4$ and 5 respectively. This could be due to the fact that he assumes a conditional probability term of his expression as equal to 1, which may not be the case.

Similarly for $\theta_1 = 3$, $\theta_2 = 5$ we have $P^{1,2} = 0.574$, while he tabulates 0.481 as the power of the test.

A method for finding $P^{1,2}$ for Murphy's test is also given by McMillan (1971) using the non-central t-distribution. However, he has not provided any tables, but has provided a curve for $n = 11$ and $\alpha = 0.02625$. Our values of $P^{1,2}$ for this combination of $n$ and $\alpha$ compare favourably with his values.

As pointed out by McMillan (1971) and Hawkins (1978), Murphy's test performs well if $\theta_1$ and $\theta_2$ are approximately equal. However, its power deteriorates if $\theta_1$ is markedly different from $\theta_2$. Our calculations also justify their statement. In fact, if $\theta_1 = 0$, that is, if there is only one
outlier then the power of the Murphy^s test (for n = 10 and 
a = 0.05) first increases, attains a maximum value around ©2 = 
and then decreases to zero. 

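For the statistic V, which is equivalent to the Studentized range test statistic, $P^*_{ij}$ and $Q^*_{ij}$ do not depend on $i$ and $j$. Consequently $Q^*_a = Q^*_{12}$, and it is possible to evaluate lower and upper bounds for the power function. Now $P^*_{12}$ is a lower bound, while an upper bound is given by

$$Q^*_{12} \le \min(1,\ \bar{Q}^*_{12}), \tag{5.4.13}$$

where from equation (5.4.10) we have

$$\bar{Q}^*_{12} = \Pr(|v_{12}| > v_\alpha \mid H^*_{12}) + (n-2)\Pr(|v_{13}| > v_\alpha \mid H^*_{12}) + (n-2)\Pr(|v_{23}| > v_\alpha \mid H^*_{12}) + \binom{n-2}{2}\Pr(|v_{34}| > v_\alpha \mid H^*_{12}).$$

For notational convenience, we denote $\Pr(|v_{ij}| > v_\alpha \mid H^*_{12})$ by $P^{*\,i,j}$. In this notation $P^{*\,1,2} = P^*_{12}$ of equation (5.4.6). Consequently,

$$\bar{Q}^*_{12} = P^{*\,1,2} + (n-2)(P^{*\,1,3} + P^{*\,2,3}) + \binom{n-2}{2} P^{*\,3,4}. \tag{5.4.14}$$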

Similar to the case of U, we tabulate the $P^{*\,i,j}$ and $\min(1, \bar{Q}^*_{12})$ for $n = 10$ and $\alpha = 0.05$, evaluated by the exact formula for $\theta_1, \theta_2 = 0, 1, \ldots, 5$, in Table 5.4.4. Bounds for the power function calculated by the approximate method are given in

Table 5.4.5 for * Although for individual terms 

the approximation is quite satisfactory, yet for small values of $\theta_1, \theta_2$ it is not good for $\bar{Q}^*_{12}$, since $P^{*\,1,3}$, $P^{*\,2,3}$ and $P^{*\,3,4}$ are multiplied by $(n-2)$, $(n-2)$ and $\binom{n-2}{2}$ respectively. For larger values of $\theta_1$ and $\theta_2$, $P^{*\,1,2}$ is the dominating term in equation (5.4.14) and the difference between exact and

approximate values of is small. For example, 0j^ = ®2 ~ 

“ '*-1 . 2 

exact and approximate values for $\bar{Q}^*_{12}$ are 0.7117 and 0.7113 respectively. Even for unequal values of $\theta_1$ and $\theta_2$, the difference is not much; for example, for $\theta_1 = 2$, $\theta_2 = 5$, we have the exact value equal to 0.3524, while the approximate value is 0.3425. As stated in Section 5.3, the approximate method is much easier and hence one can evaluate the power by this method for larger values of $\theta_1$ and $\theta_2$. However, for small values of $\theta_1, \theta_2$, one may still have to calculate the power by the exact method for obtaining better results. For the sake of comparison, we have evaluated these probabilities by the Monte Carlo technique also. The method followed is exactly similar to that of U. The power and other values are tabulated in Table 5.4.6 for $\theta_1, \theta_2 = 0, 1, \ldots, 5$. From Table 5.4.4 and Table 5.4.6, it is clear that the upper
limit for power function is close to simulated value of power 
for 0]_+02 — 


As pointed out for the case of U, the power of the statistic V also deteriorates if $\theta_1$ is markedly different from $\theta_2$. In this case also, if $\theta_1 = 0$, that is, if there is only one outlier, then the power of the Studentized range test (for $n = 10$ and $\alpha = 0.05$) first increases, attains a maximum value around $\theta_2 = 5$ and then decreases to zero.

5.4.4. Application to a two-way layout

We need the following lemma, which is similar to Corollary 5.2.2, for obtaining a measure of performance of the U statistic for a two-way table. Again the measures depend on the ratios $\theta_i/\sigma$ and $\theta_j/\sigma$. Consequently, we can take $\sigma = 1$ without any loss of generality.

Lemma 5.4.1. For $u_\alpha > 0$,

$$P_{ij} = \Pr(u_{ij} > u_\alpha \mid H_{ij}) = \frac{1}{2}\sum_{j_1=0}^{\infty} K_{j_1} \sum_{i_1=0}^{\infty} K'_{i_1}\, I_{1-u_\alpha^2}[\,j_1+a/2,\ (i_1+1)/2\,],$$

where $a = n-k-1$, and $K_{j_1}$ and $K'_{i_1}$ are as defined in equations (5.2.9) and (5.2.10) with

$$\delta = [(1+\rho_{ij})/2]\,(\lambda_{ii}^{1/2}\theta_i + \lambda_{jj}^{1/2}\theta_j)^2, \tag{5.4.15}$$

$$\eta = [(1-\rho_{ij})/2]\,(\lambda_{ii}^{1/2}\theta_i - \lambda_{jj}^{1/2}\theta_j)^2; \tag{5.4.16}$$

$\theta_i > 0$ and $\theta_j > 0$ are the deviations of the $i$th and $j$th observations from their mean under $H_0$.

Proof: The proof of this lemma is similar to Theorem 5.2.2. Now, we have from equations (5.1.3) and (5.2.3),

$$\mu = E(t_{ij} \mid H_{ij}) = c'E(y) = c'(X\beta + e_i\theta_i + e_j\theta_j) = c'e_i\theta_i + c'e_j\theta_j,$$

where $e_i$ and $e_j$ are the $i$th and $j$th column vectors of the identity matrix of order $n$, and

$$\mu = \frac{1}{[2(1+\rho_{ij})]^{1/2}}\left[(1+\rho_{ij})(\lambda_{ii}^{1/2}\theta_i + \lambda_{jj}^{1/2}\theta_j)\right] = [(1+\rho_{ij})/2]^{1/2}\,(\lambda_{ii}^{1/2}\theta_i + \lambda_{jj}^{1/2}\theta_j).$$

Since $\delta = \mu^2/\sigma^2$, we have the expression for $\delta$ as given in (5.4.15). Further, with $\lambda^*$ as in equation (5.2.1) and $\delta = \mu^2$, we have (for $\sigma = 1$)

$$\eta = \lambda^* - \delta = [(1-\rho_{ij})/2]\,(\lambda_{ii}^{1/2}\theta_i - \lambda_{jj}^{1/2}\theta_j)^2.$$

On using Theorem 5.2.1 and proceeding as in Corollary 5.2.2, the lemma follows immediately.

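Remark 1. If we have $\lambda_{ii} = \lambda$ $(i = 1,2,\ldots,n)$, then the expressions of $\delta$ and $\eta$ reduce to

$$\delta = [(1+\rho_{ij})/2]\,\lambda\,(\theta_i + \theta_j)^2 \quad\text{and}\quad \eta = [(1-\rho_{ij})/2]\,\lambda\,(\theta_i - \theta_j)^2.$$

The probability $P_{ij}$ given in Lemma 5.4.1 depends on $\delta$ and $\eta$ apart from $a$ and $u_\alpha$. For fixed $a$ and $u_\alpha$, it is not easy to determine theoretically the behaviour of $P_{ij}$ as a function of $\delta$ and $\eta$, since the weights $K_{j_1}$ and $K'_{i_1}$, which are like Poisson probability terms, have to be multiplied with incomplete beta integrals. Then the entire series has to be summed over $j_1$ and $i_1$ from $0$ to $\infty$. The first few terms of this series are given by

$$P_{ij} = \frac{1}{2}\, e^{-(\delta+\eta)/2}\left[b_{0,0} + \frac{(\delta/2)^{1/2}}{\Gamma(3/2)}\, b_{1,0} + \frac{\eta}{2}\, b_{0,1} + \ldots\right],$$

where $b_{i_1,j_1} = I_{1-u_\alpha^2}[\,j_1+a/2,\ (i_1+1)/2\,]$, and it is very difficult to determine the behaviour of $P_{ij}$ as a function of $\delta$ and $\eta$. From limited calculations we have observed that

(i) for fixed $\eta$, $P_{ij}$ increases as $\delta$ increases, and

(ii) for fixed $\delta$, $P_{ij}$ decreases as $\eta$ increases.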

Now for a two-way table the probabilities $P_{ij}$ defined in equation (5.4.1) depend on $i$ and $j$. We therefore consider the measure $P_a$ introduced in equation (5.4.3). The other measure, $Q_a$, defined in equation (5.4.4), is extremely difficult to calculate. For evaluation of $P_a$ we need the following theorem, which is proved by making note of the observation mentioned in Remark 1 above.


Theorem 5.4.1. In a two-way classification with a single observation per cell and having $r$ rows and $c$ columns, $P_{ij} = \Pr(u_{ij} > u_\alpha \mid H_{ij})$ is minimum when the two cells in which the two suspected observations occur are (1) in the same row or column if $r = c$, (2) in the same column when $r < c$, and (3) in the same row when $r > c$.

Proof: Let the two suspected observations be in the $i$th and $j$th cells. From Lemma 5.4.1, we have

$$\delta = [(1+\rho_{ij})/2]\,\lambda\,(\theta_i + \theta_j)^2 \quad\text{and}\quad \eta = [(1-\rho_{ij})/2]\,\lambda\,(\theta_i - \theta_j)^2,$$

where $\lambda = (r-1)(c-1)/rc$ is the common value of $\lambda_{ii}$ for a two-way table. For fixed $\theta_i$ and $\theta_j$, $\delta$ and $\eta$ would vary according to $(1+\rho_{ij})/2$ and $(1-\rho_{ij})/2$. Thus $\delta$ would be minimum and $\eta$ would be maximum for a minimum value of $\rho_{ij}$. Since

$$\rho_{ij} = \begin{cases} -1/(c-1), & \text{if the cells } i \text{ and } j \text{ are in the same row,} \\ -1/(r-1), & \text{if the cells } i \text{ and } j \text{ are in the same column,} \\ 1/[(r-1)(c-1)], & \text{if the cells } i \text{ and } j \text{ are neither in the same row nor in the same column,} \end{cases}$$

hence for the minimum value of $\rho_{ij}$ the two outlying cells must be in the same row or column according as $r > c$ or $r < c$. For $r = c$, the two cells can be in the same row or column. This completes the proof of the theorem.
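The three values of $\rho_{ij}$ used in the proof can be verified directly. The sketch below (Python, standard library only; function names are illustrative) assumes the usual additive two-way model with row and column effects and one observation per cell, for which the residual projection matrix is the Kronecker product $(I_r - J_r/r) \otimes (I_c - J_c/c)$, so that the element of $A$ for cells $(a, b)$ and $(a', b')$ is $(\delta_{aa'} - 1/r)(\delta_{bb'} - 1/c)$. It then evaluates $\delta$ and $\eta$ of Remark 1 for each configuration.

```python
import math


def lam(cell_a, cell_b, r, c):
    """Element of A for two cells (row, col) in an r x c additive layout."""
    (a, b), (a2, b2) = cell_a, cell_b
    return ((a == a2) - 1.0 / r) * ((b == b2) - 1.0 / c)


def rho(cell_a, cell_b, r, c):
    """Correlation rho_ij = lambda_ij / (lambda_ii lambda_jj)^(1/2)."""
    return lam(cell_a, cell_b, r, c) / math.sqrt(
        lam(cell_a, cell_a, r, c) * lam(cell_b, cell_b, r, c))


def delta_eta(rho_ij, lam_common, theta_i, theta_j):
    """delta and eta of Remark 1 (sigma = 1, common lambda_ii)."""
    delta = 0.5 * (1.0 + rho_ij) * lam_common * (theta_i + theta_j) ** 2
    eta = 0.5 * (1.0 - rho_ij) * lam_common * (theta_i - theta_j) ** 2
    return delta, eta


if __name__ == "__main__":
    r, c = 4, 5
    lam_common = (r - 1) * (c - 1) / (r * c)
    layouts = {
        "same row": ((0, 0), (0, 1)),
        "same column": ((0, 0), (1, 0)),
        "neither": ((0, 0), (1, 1)),
    }
    for name, (cell_i, cell_j) in layouts.items():
        r_ij = rho(cell_i, cell_j, r, c)
        print(name, round(r_ij, 4), delta_eta(r_ij, lam_common, 3.0, 5.0))
```

For a $4 \times 5$ table the smallest $\rho_{ij}$, and hence the smallest $\delta$ and largest $\eta$, occurs for two cells in the same column, in line with case (2) of the theorem.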

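Now for a measure of performance of the statistic V in the case of a two-way table, we require the following lemma, which is analogous to equation (5.3.16).

Lemma 5.4.2. For $v_\alpha > 0$,

$$P^*_{ij} = \Pr(|v_{ij}| > v_\alpha \mid H^*_{ij}) = \sum_{j_1=0}^{\infty} K_{j_1} \sum_{i_1=0}^{\infty} K^*_{2i_1}\, I_{1-v_\alpha^2}[\,j_1+a/2,\ i_1+1/2\,],$$

where $a = n-k-1$, and $K_{j_1}$ and $K^*_{i_1}$ are as defined in equations (5.3.7) and (5.3.8) respectively with

$$\delta^* = [(1-\rho_{ij})/2]\,(\lambda_{ii}^{1/2}\theta_i + \lambda_{jj}^{1/2}\theta_j)^2, \quad\text{and} \tag{5.4.17}$$

$$\eta^* = [(1+\rho_{ij})/2]\,(\lambda_{ii}^{1/2}\theta_i - \lambda_{jj}^{1/2}\theta_j)^2. \tag{5.4.18}$$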
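Proof: The proof of this lemma is similar to Corollary 5.2.1. Now we have from equations (5.1.3) and (5.3.2)

$$\mu^* = E(t^*_{ij} \mid H^*_{ij}) = d'E(y) = d'(X\beta - e_i\theta_i + e_j\theta_j) = d'e_j\theta_j - d'e_i\theta_i,$$

where $d'$ is as defined in the proof of Lemma 5.3.1. Hence

$$\mu^* = \frac{1}{[2(1-\rho_{ij})]^{1/2}}\left[\left(\frac{\lambda_{ij}}{\lambda_{ii}^{1/2}} - \lambda_{jj}^{1/2}\right)\theta_j - \left(\lambda_{ii}^{1/2} - \frac{\lambda_{ij}}{\lambda_{jj}^{1/2}}\right)\theta_i\right] = -[(1-\rho_{ij})/2]^{1/2}\,(\lambda_{ii}^{1/2}\theta_i + \lambda_{jj}^{1/2}\theta_j).$$

Since $\delta^* = \mu^{*2}$, hence

$$\delta^* = [(1-\rho_{ij})/2]\,(\lambda_{ii}^{1/2}\theta_i + \lambda_{jj}^{1/2}\theta_j)^2.$$

Further, with $\lambda^{**}$ as in equation (5.3.1), and $\delta^* = \mu^{*2}$, we have

$$\eta^* = \lambda^{**} - \delta^* = \lambda_{ii}\theta_i^2 + \lambda_{jj}\theta_j^2 - 2\lambda_{ij}\theta_i\theta_j - [(1-\rho_{ij})/2]\,(\lambda_{ii}^{1/2}\theta_i + \lambda_{jj}^{1/2}\theta_j)^2$$

$$= [(1+\rho_{ij})/2]\,(\lambda_{ii}^{1/2}\theta_i - \lambda_{jj}^{1/2}\theta_j)^2.$$

On using Theorem 5.2.1 and proceeding as was done for obtaining equation (5.3.16), the lemma follows immediately.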

Remark 2. When $\lambda_{ii} = \lambda$, the expressions of $\delta^*$ and $\eta^*$ are

$$\delta^* = [(1-\rho_{ij})/2]\,\lambda\,(\theta_i + \theta_j)^2, \qquad \eta^* = [(1+\rho_{ij})/2]\,\lambda\,(\theta_i - \theta_j)^2.$$

Similar to the case of U, as mentioned in Remark 1, theoretically the behaviour of $P^*_{ij}$ as a function of $\delta^*$ and $\eta^*$ is difficult to determine. Hence again, from limited calculations, we observe that

(i) for fixed $\eta^*$, $P^*_{ij}$ increases as $\delta^*$ increases;

(ii) for fixed $\delta^*$, $P^*_{ij}$ decreases as $\eta^*$ increases.
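The split of the total non-centrality $\lambda^{**}$ of equation (5.3.1) into $\delta^*$ and $\eta^*$ used in Lemma 5.4.2 can be checked numerically. The following is a minimal sketch (Python, standard library only; the function name is illustrative, and the inputs in the usage lines correspond to a random sample of size 10, for which $\lambda_{ii} = 0.9$ and $\lambda_{ij} = -0.1$, with $\sigma = 1$ assumed).

```python
import math


def noncentrality_v(lam_ii, lam_jj, lam_ij, theta_i, theta_j):
    """Return (delta*, eta*, lambda**) of Lemma 5.4.2 and equation (5.3.1), sigma = 1."""
    rho = lam_ij / math.sqrt(lam_ii * lam_jj)
    a = math.sqrt(lam_ii) * theta_i
    b = math.sqrt(lam_jj) * theta_j
    delta_star = 0.5 * (1.0 - rho) * (a + b) ** 2
    eta_star = 0.5 * (1.0 + rho) * (a - b) ** 2
    lam_2star = (lam_ii * theta_i ** 2 - 2.0 * lam_ij * theta_i * theta_j
                 + lam_jj * theta_j ** 2)
    return delta_star, eta_star, lam_2star


if __name__ == "__main__":
    d, e, total = noncentrality_v(0.9, 0.9, -0.1, 2.0, 5.0)
    print(d, e, total)
```

Equal shifts $\theta_i = \theta_j$ give $\eta^* = 0$, so the whole non-centrality goes into the numerator term $\delta^*$; this is consistent with the observation that V performs best when the two outliers are of similar size.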


Now for the measure of performance of V given by equation (5.4.8), the alternative hypothesis to be considered is given by the following theorem, which is proved by making note of the observation mentioned in Remark 2 above.

Theorem 5.4.2. In a two-way classification with a single observation per cell and having $r$ rows and $c$ columns, the $P^*_{ij}$ value given in Lemma 5.4.2 is minimum when the two cells in which the suspected outliers are lying are neither in the same row nor in the same column.

Proof: Suppose the two outlying observations are lying in the $i$th and $j$th cells; then, as mentioned in Remark 2, for $P^*_{ij}$ to be minimum, we need $\delta^*$ to be minimum and $\eta^*$ to be maximum. For fixed $\theta_i$ and $\theta_j$, $\delta^*$ and $\eta^*$ would vary according to $(1-\rho_{ij})/2$ and $(1+\rho_{ij})/2$. Thus $\delta^*$ would be minimum and $\eta^*$ would be maximum when $\rho_{ij}$ assumes its maximum value $1/[(r-1)(c-1)]$. This happens when the two cells in which the suspected observations are lying are neither in the same row nor in the same column. This completes the proof of the theorem.



For the sake of comparison in Section 5.5, we tabulate approximate values of $P^*_a = P^{*\,11,22}$ for $4 \times 5$ and $6 \times 8$ tables, for $\alpha = 0.02625$, in Table 5.4.7, for $0 \le \theta_1 \le \theta_2 \le 7$.

Similar to the case of a random sample from the normal distribution, here also the performance of the test statistic V is not very good when $\theta_1$ is much smaller compared to $\theta_2$. However, for the same values of $\theta_1, \theta_2$ the statistic performs better for the $6 \times 8$ table compared to the $4 \times 5$ case.

The measure $P_a = P^{11,21}$ (for the $r < c$ case) studied for $4 \times 5$ and $6 \times 8$ tables gives analogous results for the statistic U.

5.5. Comparison with sequential procedure

To assess the performance of our procedure, we compare it with the sequential procedure. The procedure for a random sample from $N(\mu, \sigma^2)$, as discussed by McMillan (1971) and McMillan and David (1971), is as follows.

Suppose $y_1, y_2, \ldots, y_n$ are independent, normally distributed observations with common variance $\sigma^2$ and, in the absence of outliers ($H_0$), common mean $\mu$. Let $y_{(1)} \le y_{(2)} \le \cdots \le y_{(n)}$ be the corresponding order statistics, $\bar{y} = \sum y_i/n$ and $S^2 = \sum (y_i - \bar{y})^2$. If $y_{(n)} - \bar{y} > g_{n,\alpha} S$, then $y_{(n)}$ is declared an outlier and the test is repeated using the remaining observations. If

    $y_{(n-1)} - \bar{y}_{(n)} > g_{n-1,\alpha} S_{(n)}$,

where

    $\bar{y}_{(n)} = \sum_{i=1}^{n-1} y_{(i)}/(n-1)$,  $S_{(n)}^2 = \sum_{i=1}^{n-1} (y_{(i)} - \bar{y}_{(n)})^2$,

then $y_{(n-1)}$ is also declared an outlier, etc. Values of $g_{n,\alpha}$ such that $\Pr\{y_{(n)} - \bar{y} > g_{n,\alpha} S \mid H_0\} = \alpha$ are given by Quesenberry and David (1961), and related constants are tabulated by Grubbs (1950) and Grubbs and Beck (1972).
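The steps of the sequential procedure can be sketched as follows. This is a minimal illustration, not the cited authors' code: the critical values are supplied by the caller through a hypothetical function `crit(m)`, and the constants 0.5 and 0.7 used in the example are arbitrary illustrative thresholds, not tabulated percentage points.

```python
import math


def sequential_outliers(y, crit, max_steps=2):
    """Return the observations declared outliers on the right, largest first.

    crit(m) is a caller-supplied (hypothetical) function giving the critical
    value for a sample of size m; it is not computed here.
    """
    sample = sorted(y)
    declared = []
    for _ in range(max_steps):
        m = len(sample)
        if m < 3:
            break
        mean = sum(sample) / m
        S = math.sqrt(sum((v - mean) ** 2 for v in sample))
        if S == 0 or sample[-1] - mean <= crit(m) * S:
            break  # largest value not significant: stop testing
        declared.append(sample.pop())  # remove it and retest the remainder
    return declared


y = [0.1, -0.3, 0.2, 0.0, -0.1, 0.15, -0.2, 8.0, 9.0]
print(sequential_outliers(y, lambda m: 0.5))
print(sequential_outliers(y, lambda m: 0.7))
```

With the larger threshold the first step already fails, because the second extreme value inflates $S$; this is the masking effect that motivates a block procedure.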

To evaluate the performance of this procedure, we suppose that two of the observations, which without loss of generality we take as $y_1$ and $y_2$, come from a different normal population than the rest of the sample. Specifically, we assume that $y_1 \sim N(\mu + \theta_1, \sigma^2)$ and $y_2 \sim N(\mu + \theta_2, \sigma^2)$, where $\theta_1, \theta_2 > 0$. For this procedure we then calculate the probability

(5.5.1)    $P_b = \Pr\{\text{significance of both } y_1 \text{ and } y_2 \text{ in} \le 2 \text{ steps}\}$.

As pointed out in McMillan and David (1971), Murphy's procedure at significance level $(\alpha + \alpha^2)/2$ has to be compared with the sequential procedure at significance level $\alpha$, so that under $H_0$ the expected number of observations declared as outliers is equal for both procedures. McMillan (1971) and Moran and McMillan (1973) have compared this measure $P_b$ with the measure $P^{1,2}$ defined in the last section. In addition to some other values, they have tabulated the values of $P_b$ for n = 21, $\alpha = 0.05$ and the $\theta_1 = \theta_2$ case. Methods for evaluation of $P^{1,2}$ by exact, approximate and Monte Carlo techniques are discussed in the last section. These values are tabulated in Table 5.5.1 for n = 21, $\alpha = 0.02625$, where the Monte Carlo values are based on 2000 iterations. It is worth mentioning that in Table 5.4.1 of the last section we used the exact critical value for Murphy's test. Now, however, we have only a nominal percentage point $u_\alpha$. Consequently the values of $P^{1,2}$ as tabulated provide only a lower bound for the exact values. Of course, since n and $\alpha$ are small, the difference is only marginal.
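The level 0.02625 used throughout this section follows directly from the matching rule $(\alpha + \alpha^2)/2$ with $\alpha = 0.05$. The short sketch below only evaluates that formula; the function name `block_level` is illustrative.

```python
def block_level(alpha):
    """Level for the two-outlier block test that matches a sequential
    test run at level alpha, using the rule (alpha + alpha^2) / 2."""
    return (alpha + alpha ** 2) / 2


print(block_level(0.05))
```

One way to read the rule, under the assumption that each sequential step rejects with probability about $\alpha$: the sequential test declares on average about $\alpha + \alpha^2$ outliers under $H_0$, while the block test declares two outliers with probability equal to its level.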

Since we have used nominal percentage points for $P^{1,2}$, for a proper comparison we also use nominal percentage points for the evaluation of $P_b$ instead of exact percentage points. Nominal percentage points can be obtained from tables prepared by Srikantan (1961), Joshi (1975), Lund (1975), Doornbos (1980), and others. They can also be obtained directly by the methods described in these papers. We point out here that for n = 20, 21 and $\alpha = 0.05$, the nominal percentage points obtained from Doornbos agree perfectly with the exact values tabulated by Grubbs and Beck (1972). The method given in Moran and McMillan (1973) for the evaluation of $P_b$ by numerical integration is cumbersome. For our purposes we evaluate $P_b$ for $\alpha = 0.05$ and n = 21 by the Monte Carlo method, basing our results on 1000 iterations. These values are also tabulated in Table 5.5.1.

The simulated values of $P_b$ using nominal percentage points agree reasonably well with the values of $P_b$ tabulated by Moran and McMillan (1973) using exact percentage points. For example, for $\theta_1 = \theta_2 = 4$, the exact value of $P_b$ is 0.421, while that obtained by simulation is 0.419. Similarly for $\theta_1 = \theta_2 = 5$, $P_b$ (exact) = 0.759 and $P_b$ (simulated) = 0.746.

From Table 5.5.1 we see that Murphy's test performs much better than the sequential procedure, even if $\theta_1$ and $\theta_2$ are not equal. McMillan (1971) came to the same conclusion for the $\theta_1 = \theta_2$ case.

Corresponding results for the V statistic, which in this case is the internally Studentized range statistic, for n = 20 and significance level 0.05 are given in Table 5.5.2. We tabulate $P^{*1,2}$ for $\alpha = 0.02625$ by the approximate method only, while the values of $P_b$ for $\alpha = 0.05$ are obtained by simulation with 1000 iterations.

For this case also, the V statistic performs much better than the sequential procedure for all values of $\theta_1$ and $\theta_2$ considered. In fact, the difference between $P^{*1,2}$ and $P_b$ is substantial even if $\theta_1 \ne \theta_2$.

We next consider a two-way layout with r rows and c columns. For simplicity we shall consider the case r < c in detail. Now from Theorem 5.4.1 we see that $P_a = P^{11,21}$, that is, the minimum value of $P^{i_1 j_1, i_2 j_2}$ occurs when the two outliers are in the same column. The values of $P^{11,21}$ for $\alpha = 0.02625$ for 4x5 and 6x8 tables are tabulated in Table 5.5.3 for the $\theta_1 = \theta_2 = \theta$ case. The values for $\theta \le 5$ are obtained by the exact method, while for larger values of $\theta$ they are obtained by the approximate method.

For comparison, we also tabulate $P_b$ values obtained by Monte Carlo techniques. Here $\theta$ is added to the observed values of cells (1,1) and (2,1) and the data are tested for an outlier. After one outlier is detected, we remove that observation and reanalyse the data for a second outlier. Again the nominal percentage points obtained from Doornbos (1980) can be used. In this manner we evaluate $P_b$ for $\alpha = 0.05$ for 4x5 and 6x8 tables by simulation using 2000 iterations.

From Table 5.5.3 we see that the U statistic performs markedly better than the sequential procedure, especially for the 4x5 table, for which the sequential procedure performs very poorly.

For the statistic V, we see that the minimum value of $P^{*i_1 j_1, i_2 j_2}$ occurs when the outlying observations are neither in the same row nor in the same column, that is, $P^{*}_{a} = P^{*11,22}$. These values of $P^{*11,22}$ are tabulated in Table 5.5.4 for $\alpha = 0.02625$ for 4x5 and 6x8 tables and for $\theta_1 = \theta_2 = \theta$. They are obtained by the approximate method described in the calculations for Table 5.4.7 of the last section.

Again for comparison, we tabulate $P_b$ values obtained by the Monte Carlo method. In this case $\theta$ is subtracted from the observed value of cell (1,1) and $\theta$ is added to the observed value of cell (2,2). The computations are similar to those for the U statistic.

Here again, we see from Table 5.5.4 that the statistic V performs considerably better than the sequential procedure, especially for the 4x5 table.

Thus in all cases considered our proposed statistics U and V for two outliers perform better than the sequential test statistic.
TABLE 5.3.1. Exact and approximate values of $P^{*1,2}$ for a random sample of size n = 10 and $\alpha$ = 0.05.

$\theta_1$  $\theta_2$    Exact     Approximate   Relative error
   0    0     0.00111     0.00111       0.000
   0    1     0.00269     0.00273      -0.015
   0    2     0.00734     0.00737      -0.004
   0    3     0.01341     0.01284       0.043
   0    4     0.01789     0.01640       0.083
   0    5     0.01915     0.01702       0.111
   1    1     0.01258     0.01274      -0.013
   1    2     0.03571     0.03545       0.007
   1    3     0.06683     0.06548       0.020
   1    4     0.09230     0.08957       0.030
   1    5     0.10394     0.10022       0.036
   2    2     0.10266     0.10089       0.017
   2    3     0.19425     0.19137       0.015
   2    4     0.27274     0.26973       0.011
   2    5     0.31629     0.31387       0.008
   3    3     0.36765     0.36443       0.009
   3    4     0.51476     0.51285       0.004
   3    5     0.60011     0.59979       0.001
   4    4     0.70998     0.71031       0.000
TABLE 5.3.2. Exact and approximate values of $P^{1,2}$ for a random sample of size n = 10 and $\alpha$ = 0.05.

$\theta_1$  $\theta_2$    Exact     Approximate   Relative error
   0    0     0.00111     0.00222      -1.000
   0    1     0.00416     0.00435      -0.046
   0    2     0.00925     0.00918       0.008
   0    3     0.01360     0.01294       0.049
   0    4     0.01478     0.01350       0.087
   0    5     0.01297     0.01146       0.116
   1    1     0.01744     0.01747      -0.002
   1    2     0.04300     0.04236       0.015
   1    3     0.06994     0.06814       0.026
   1    4     0.08437     0.08135       0.036
   1    5     0.08319     0.07949       0.044
   2    2     0.11575     0.11312       0.023
   2    3     0.20313     0.19929       0.019
   2    4     0.26402     0.26015       0.015
   2    5     0.28339     0.28002       0.012
   3    3     0.37602     0.37182       0.011
   3    4     0.50979     0.50723       0.005
   3    5     0.57368     0.57278       0.002
   4    4     0.70405     0.70437       0.000
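The relative error column of Tables 5.3.1 and 5.3.2 appears to be (exact - approximate)/exact. That definition is inferred here from the tabulated numbers rather than quoted from the text, so the sketch below is a consistency check on a few rows of Table 5.3.2 and nothing more.

```python
def relative_error(exact, approx):
    """Relative error as inferred from the tables: (exact - approx) / exact."""
    return (exact - approx) / exact


# (exact, approximate) pairs copied from Table 5.3.2
rows = [(0.00416, 0.00435), (0.01478, 0.01350), (0.50979, 0.50723)]
for exact, approx in rows:
    print(round(relative_error(exact, approx), 3))
```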
TABLE 5.4.1. $P^{1,2}$, $P^{1,3}$, $P^{2,3}$, $P^{3,4}$ and Min values for n = 10 and $\alpha$ = 0.05 obtained by exact method.

$\theta_1$  $\theta_2$   $P^{1,2}$   $P^{1,3}$   $P^{2,3}$   $P^{3,4}$     Min
   0    0      .0011    .0011    .0011    .0011    .0500
   0    1      .0042    .0012    .0042    .0012    .0807
   0    2      .0093    .0007    .0093    .0007    .1082
   0    3      .0136    .0002    .0136    .0002    .1304
   0    4      .0148    .0000    .0148    .0000    .1344
   0    5      .0130    .0000    .0130    .0000    .1169
   1    1      .0174    .0022    .0022    .0014    .0914
   1    2      .0430    .0006    .0051    .0009    .1136
   1    3      .0699    .0001    .0080    .0003    .1430
   1    4      .0844    .0000    .0090    .0001    .1581
   1    5      .0832    .0000    .0081    .0000    .1482
   2    2      .1158    .0015    .0015    .0006    .1579
   2    3      .2031    .0003    .0027    .0003    .2332
   2    4      .2640    .0000    .0033    .0001    .2919
   2    5      .2834    .0000    .0031    .0000    .3087
   3    3      .3760    .0005    .0005    .0001    .3863
   3    4      .5098    .0001    .0007    .0000    .5163
   3    5      .5737    .0000    .0007    .0000    .5796
   4    4      .7041    .0001    .0001    .0000    .7055
   4    5      .8031    .0000    .0001    .0000    .8039
   5    5      .9106    .0000    .0000    .0000    .9107
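The last column of Table 5.4.1 is consistent with a first Bonferroni upper bound, that is, the minimum of 1 and the sum of the pairwise probabilities over all 45 pairs. This reading is an inference from the numbers, and the function name `bonferroni_upper_bound` is illustrative. With outliers in positions 1 and 2, the pairs split into one pair (1,2), n-2 pairs of each of the types (1,j) and (2,j), and the remaining pairs of type (3,4).

```python
from math import comb


def bonferroni_upper_bound(n, p12, p13, p23, p34):
    """min(1, sum over all pairs) when observations 1 and 2 are the outliers."""
    total = p12 + (n - 2) * (p13 + p23) + comb(n - 2, 2) * p34
    return min(1.0, total)


# Row (theta1, theta2) = (0, 1) of Table 5.4.1; the tabulated bound is .0807
print(bonferroni_upper_bound(10, 0.0042, 0.0012, 0.0042, 0.0012))
```

The small discrepancy from the tabulated value comes from the four-decimal rounding of the inputs.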
TABLE 5.4.2. Approximate upper limit (top row) and lower limit (bottom row) for the power of Murphy's test for n = 10, $\alpha$ = 0.05.

$\theta_2$ \ $\theta_1$     0       1       2       3       4       5       6       7
    0         0.100
              0.002
    1         0.101   0.100
              0.004   0.017
    2         0.109   0.111   0.150
              0.009   0.042   0.113
    3         0.122   0.135   0.224   0.380
              0.013   0.068   0.199   0.372
    4         0.122   0.147   0.283   0.512   0.705
              0.014   0.081   0.260   0.507   0.704
    5         0.103   0.136   0.301   0.577   0.805   0.916
              0.011   0.079   0.280   0.573   0.805   0.916
    6         0.076   0.109   0.283   0.587   0.839   0.953
              0.008   0.068   0.267   0.583   0.838   0.953
    7         0.049   0.079   0.245   0.560   0.837   0.962   0.993   0.999
              0.005   0.052   0.234   0.558   0.837   0.962   0.993   0.999
TABLE 5.4.3. $P^{1,2}$, $P^{1,3}$, $P^{2,3}$, $P^{3,4}$ and $Q_{12}$ values for n = 10 and $\alpha$ = 0.05 obtained by Monte Carlo method.

$\theta_1$  $\theta_2$   $P^{1,2}$   $P^{1,3}$   $P^{2,3}$   $P^{3,4}$   $Q_{12}$
   0    0      .0011    .0013    .0012    .0013    .0519
   0    1      .0033    .0005    .0041    .0005    .0599
   0    2      .0102    .0001    .0100    .0001    .0946
   0    3      .0145    .0000    .0140    .0000    .1275
   0    4      .0161    .0000    .0149    .0000    .1351
   0    5      .0145    .0000    .0128    .0000    .1170
   1    1      .0180    .0023    .0024    .0003    .0640
   1    2      .0449    .0007    .0056    .0001    .0972
   1    3      .0746    .0001    .0083    .0000    .1426
   1    4      .0382    .0000    .0092    .0000    .1619
   1    5      .0352    .0000    .0083    .0000    .1513
   2    2      .1156    .0016    .0018    .0000    .1433
   2    3      .2032    .0004    .0028    .0000    .2337
   2    4      .2653    .0000    .0033    .0000    .2922
   2    5      .2846    .0000    .0032    .0000    .3100
   3    3      .3761    .0006    .0005    .0000    .3847
   3    4      .5103    .0001    .0007    .0000    .5171
   3    5      .5753    .0000    .0007    .0000    .5808
   4    4      .7048    .0002    .0000    .0000    .7066
   4    5      .8029    .0000    .0001    .0000    .8033
   5    5      .9124    .0000    .0000    .0000    .9126
TABLE 5.4.4. $P^{*1,2}$, $P^{*1,3}$, $P^{*2,3}$, $P^{*3,4}$ and Min values for n = 10 and $\alpha$ = 0.05 obtained by exact method.

$\theta_1$  $\theta_2$   $P^{*1,2}$  $P^{*1,3}$  $P^{*2,3}$  $P^{*3,4}$    Min
   0    0      .0011    .0011    .0011    .0011    .0499
   0    1      .0027    .0008    .0027    .0003    .0524
   0    2      .0073    .0003    .0073    .0003    .0556
   0    3      .0134    .0001    .0134    .0001    .1224
   0    4      .0179    .0000    .0179    .0000    .1612
   0    5      .0192    .0000    .0192    .0000    .1724
   1    1      .0126    .0018    .0018    .0005    .0556
   1    2      .0357    .0006    .0047    .0002    .0831
   1    3      .0668    .0001    .0084    .0000    .1360
   1    4      .0923    .0000    .0111    .0000    .1814
   1    5      .1039    .0000    .0118    .0000    .1984
   2    2      .1027    .0017    .0017    .0001    .1305
   2    3      .1943    .0003    .0030    .0000    .2212
   2    4      .2727    .0000    .0041    .0000    .3059
   2    5      .3163    .0000    .0045    .0000    .3524
   3    3      .3677    .0006    .0006    .0000    .3775
   3    4      .5143    .0001    .0009    .0000    .5224
   3    5      .6001    .0000    .0010    .0000    .6085
   4    4      .7100    .0001    .0001    .0000    .7117
   4    5      .8130    .0000    .0001    .0000    .8187
   5    5      .9232    .0000    .0000    .0000    .9233
TABLE 5.4.5. Approximate upper limit (top row) and lower limit (bottom row) for the power of the Studentized range test for n = 10, $\alpha$ = 0.05.

$\theta_2$ \ $\theta_1$     0       1       2       3       4       5       6       7
    0         0.050
              0.001
    1         0.052   0.054
              0.003   0.013
    2         0.074   0.080   0.125
              0.007   0.035   0.101
    3         0.116   0.129   0.214   0.371
              0.013   0.065   0.191   0.364
    4         0.143   0.168   0.297   0.518   0.711
              0.016   0.090   0.270   0.513   0.710
    5         0.153   0.181   0.343   0.606   0.819   0.923
              0.017   0.100   0.314   0.600   0.818   0.923
    6         0.138   0.171   0.351   0.640   0.863   0.961   0.989
              0.015   0.098   0.325   0.635   0.862   0.961   0.989
    7         0.112   0.146   0.334   0.640   0.875   0.972   0.995   0.999
              0.012   0.088   0.313   0.635   0.874   0.971   0.995   0.999
TABLE 5.4.6. $P^{*1,2}$, $P^{*1,3}$, $P^{*2,3}$, $P^{*3,4}$ and $Q^{*}_{12}$ values for n = 10 and $\alpha$ = 0.05 obtained by Monte Carlo method.

$\theta_1$  $\theta_2$   $P^{*1,2}$  $P^{*1,3}$  $P^{*2,3}$  $P^{*3,4}$  $Q^{*}_{12}$
   0    0      .0009    .0011    .0011    .0011    .0494
   0    1      .0035    .0009    .0029    .0008    .0553
   0    2      .0089    .0002    .0078    .0003    .0804
   0    3      .0138    .0001    .0140    .0000    .1275
   0    4      .0184    .0000    .0183    .0000    .1646
   0    5      .0208    .0000    .0193    .0000    .1751
   1    1      .0130    .0016    .0017    .0005    .0526
   1    2      .0359    .0006    .0050    .0002    .0850
   1    3      .0670    .0001    .0089    .0000    .1396
   1    4      .0970    .0000    .0117    .0000    .1903
   1    5      .1100    .0000    .0123    .0000    .2081
   2    2      .0998    .0016    .0017    .0000    .1268
   2    3      .1989    .0003    .0032    .0000    .2265
   2    4      .2757    .0000    .0042    .0000    .3991
   2    5      .3168    .0000    .0047    .0000    .3542
   3    3      .3770    .0006    .0006    .0000    .3865
   3    4      .5179    .0000    .0009    .0000    .5255
   3    5      .5994    .0000    .0011    .0000    .6079
   4    4      .7121    .0001    .0001    .0000    .7134
   4    5      .8213    .0000    .0001    .0000    .8224
   5    5      .9237    .0000    .0000    .0000    .9239
TABLE 5.4.7. Approximate values of $P^{*}_{a} = P^{*11,22}$ for 4x5 (top row) and 6x8 tables (bottom row), for $\alpha$ = 0.02625.

$\theta_2$ \ $\theta_1$     0       1       2       3       4       5       6       7
    0         0.000
              0.000
    1         0.000   0.001
              0.000   0.001
    2         0.001   0.004   0.011
              0.001   0.005   0.013
    3         0.002   0.008   0.025   0.057
              0.003   0.015   0.051   0.130
    4         0.002   0.012   0.041   0.101   0.188
              0.008   0.037   0.114   0.261   0.459
    5         0.002   0.014   0.055   0.144   0.278   0.421
              0.019   0.075   0.209   0.423   0.653   0.828
    6         0.002   0.015   0.062   0.173   0.347   0.534   0.682
              0.036   0.131   0.326   0.583   0.802   0.927   0.978
    7         0.002   0.013   0.061   0.184   0.384   0.603   0.773   0.872
              0.062   0.202   0.449   0.715   0.894   0.972   0.994   0.999
TABLE 5.5.1. Exact, approximate and simulated values of $P^{1,2}$ for the statistic U with $\alpha$ = 0.02625 and simulated values of $P_b$ for the sequential test with $\alpha$ = 0.05 and n = 21.

$\theta_1$  $\theta_2$   $P^{1,2}$ exact   $P^{1,2}$ approximate   $P^{1,2}$ simulated   $P_b$ simulated
   0    0        0.000           0.000               0.000             0.000
   0    1        0.001           0.001               0.001             0.000
   0    2        0.004           0.004               0.002             0.001
   0    3        0.010           0.011               0.012             0.003
   0    4        0.022           0.023               0.019             0.001
   0    5        0.036           0.036               0.035             0.002
   1    1        0.006           0.006               0.005             0.002
   1    2        0.022           0.023               0.023             0.000
   1    3        0.057           0.057               0.056             0.012
   1    4        0.109           0.107               0.105             0.020
   1    5        0.167           0.164               0.152             0.020
   2    2        0.078           0.077               0.073             0.017
   2    3        0.182           0.177               0.180             0.046
   2    4        0.311           0.304               0.317             0.094
   2    5        0.433           0.427               0.430             0.119
   3    3        0.376           0.368               0.377             0.132
   3    4        0.572           0.567               0.578             0.244
   3    5        0.718           0.718               0.721             0.342
   4    4        0.779           0.781               0.771             0.419
   4    5        0.895           0.900               0.900             0.581
   5    5        0.970           0.971               0.973             0.746
TABLE 5.5.2. Approximate value of $P^{*1,2}$ for the statistic V with $\alpha$ = 0.02625 and simulated value of $P_b$ for the sequential test with $\alpha$ = 0.05 and n = 20.

$\theta_1$  $\theta_2$   $P^{*1,2}$ approximate   $P_b$ simulated
   0    0          0.000               0.000
   0    1          0.001               0.000
   0    2          0.003               0.000
   0    3          0.009               0.000
   0    4          0.019               0.002
   0    5          0.032               0.002
   1    1          0.005               0.000
   1    2          0.018               0.002
   1    3          0.048               0.002
   1    4          0.094               0.011
   1    5          0.150               0.017
   2    2          0.065               0.006
   2    3          0.157               0.015
   2    4          0.281               0.049
   2    5          0.405               0.088
   3    3          0.342               0.074
   3    4          0.542               0.159
   3    5          0.699               0.251
   4    4          0.763               0.317
   4    5          0.889               0.492
   5    5          0.967               0.630
TABLE 5.5.3. The $P^{11,21}$ values for the statistic U with $\alpha$ = 0.02625 and simulated values of $P_b$ for the sequential test with $\alpha$ = 0.05 for two-way tables. The values shown with asterisks denote exact values.

              4x5 table              6x8 table
 $\theta$     $P^{11,21}$    $P_b$       $P^{11,21}$    $P_b$
   0       0.000     0.000      0.000*    0.000
   1       0.002     0.001      0.001*    0.001
   2       0.01?*    0.003      0.015*    0.005
   3       0.048*    0.004      0.113*    0.017
   4       0.150*    0.005      0.395*    0.078
   5       0.336*    0.005      0.747*    0.203
   6       0.569     0.007      0.951     0.394
   7       0.780     0.00?      0.996     0.533
   8       0.914     0.003      1.000     0.651
   9       0.974     0.010      1.000     0.763
TABLE 5.5.4. The $P^{*11,22}$ values for the statistic V with $\alpha$ = 0.02625 and simulated values of $P_b$ for the sequential test with $\alpha$ = 0.05 for two-way tables.

              4x5 table              6x8 table
 $\theta$     $P^{*11,22}$   $P_b$       $P^{*11,22}$   $P_b$
   0       0.000     0.000      0.000     0.000
   1       0.001     0.000      0.001     0.000
   2       0.011     0.000      0.018     0.001
   3       0.057     0.003      0.130     0.012
   4       0.188     0.005      0.459     0.099
   5       0.421     0.014      0.828     0.288
   6       0.682     0.026      0.978     0.533
   7       0.872     0.035      0.999     0.720
   8       0.963     0.040      1.000     0.836
   9       0.992     0.039      1.000     0.910
CHAPTER VI


EXTENSION FOR MORE THAN TWO OUTLIERS

6.1. Introduction

In previous chapters we dealt with the case when only two outliers were present. For three or more outliers, earlier work has been done by Rosner (1975), Draper and John (1980), Gentleman (1980) and others. Bradu and Hawkins (1982) have also discussed the multiple outlier case using tetrads.

We now extend our results to three or more outliers for the one-sided case, that is, when all outliers are in the same direction. The statistic is introduced in Section 6.2 and distribution theory results are given in Section 6.3. Nominal upper percentage points are tabulated for the three-outlier case in Section 6.4. The performance of the test statistic is studied in Section 6.5.


6.2. Motivation of the statistic

The $u_{ij}$'s for the one-sided statistic U in the two-outlier case are defined by

    $u_{ij} = t_{ij}/S_p$,

where

    $t_{ij} = S_p (w_i + w_j)/[2(1+\rho_{ij})]^{1/2}$

has variance equal to $\sigma^2$.
When three outliers are present on the right, we define the test statistic for detecting them as follows. Let

(6.2.1)    $t_{hij} = C_{hij} S_p (w_h + w_i + w_j)$, and

(6.2.2)    $u_{hij} = t_{hij}/S_p = C_{hij}(w_h + w_i + w_j)$,

where $w_i$ and $S_p$ are defined in Section 2.1. We now choose $C_{hij}$ so that the variance of $t_{hij}$ is $\sigma^2$, and define the statistic $U^{(3)}$ for three outliers by

    $U^{(3)} = \max_{1 \le h < i < j \le n} u_{hij}$.

Now,

    $\mathrm{Var}(t_{hij}) = \mathrm{Var}[C_{hij}(e_h/\lambda_{hh}^{1/2} + e_i/\lambda_{ii}^{1/2} + e_j/\lambda_{jj}^{1/2})]$
                 $= C_{hij}^2 \sigma^2 (1 + 1 + 1 + 2\rho_{hi} + 2\rho_{hj} + 2\rho_{ij})$.

Equating it to $\sigma^2$ we immediately get, for finite $C_{hij}$,

    $C_{hij} = 1/[3 + 2(\rho_{hi} + \rho_{hj} + \rho_{ij})]^{1/2}$

and

    $u_{hij} = (w_h + w_i + w_j)/[3 + 2(\rho_{hi} + \rho_{hj} + \rho_{ij})]^{1/2}$.

For a random sample of size n from $N(\mu, \sigma^2)$, we have $\rho_{ij} = \rho = -1/(n-1)$ for all $i \ne j$ and $\lambda_{ii} = (n-1)/n$ for all i. Hence for $\nu = 0$ we have

    $C_{hij} = [(n-1)/\{3(n-3)\}]^{1/2}$,

and

    $u_{hij} = K(y_h + y_i + y_j - 3\bar{y})/s$,

where $K = [n/\{3(n-1)(n-3)\}]^{1/2}$ and $s = S/(n-1)^{1/2}$ is the sample standard deviation. This gives

(6.2.3)    $U^{(3)} = K (y_{(n)} + y_{(n-1)} + y_{(n-2)} - 3\bar{y})/s = K\,T_{N3}$,

where $T_{N3}$ is Murphy's statistic for three outliers.
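The equivalence in (6.2.3) can be checked numerically for a random sample. The sketch below assumes k = 1 (mean only) and $\nu = 0$, takes $s$ to be the usual sample standard deviation with divisor n-1, and uses illustrative function names; it compares the maximum of $C_{hij}(w_h + w_i + w_j)$ over all triples with $K$ times the sum of the three largest deviations divided by $s$.

```python
import math
from itertools import combinations


def u3_general(y):
    """Max over triples of C * (w_h + w_i + w_j) for a random sample."""
    n = len(y)
    ybar = sum(y) / n
    S = math.sqrt(sum((v - ybar) ** 2 for v in y))   # root of residual sum of squares
    lam = (n - 1) / n                                 # lambda_ii
    rho = -1.0 / (n - 1)                              # rho_ij, i != j
    w = [(v - ybar) / (S * math.sqrt(lam)) for v in y]
    C = 1.0 / math.sqrt(3 + 2 * (3 * rho))
    return max(C * (w[h] + w[i] + w[j]) for h, i, j in combinations(range(n), 3))


def u3_murphy(y):
    """K * (sum of three largest - 3 * mean) / s, as in (6.2.3)."""
    n = len(y)
    ybar = sum(y) / n
    s = math.sqrt(sum((v - ybar) ** 2 for v in y) / (n - 1))
    K = math.sqrt(n / (3 * (n - 1) * (n - 3)))
    return K * (sum(sorted(y)[-3:]) - 3 * ybar) / s


y = [1.2, 0.4, 3.1, 2.2, 5.0, 0.9, 1.7]
print(u3_general(y), u3_murphy(y))
```

Both functions need n > 3, since $3 + 6\rho$ must be positive.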

The generalization to $m_1$ outliers on the right is now immediate. The statistic $U^{(m_1)}$ is defined by

    $U^{(m_1)} = \max_{1 \le i_1 < i_2 < \cdots < i_{m_1} \le n} u_{i_1 i_2 \ldots i_{m_1}}$,

where

    $u_{i_1 i_2 \ldots i_{m_1}} = (w_{i_1} + w_{i_2} + \cdots + w_{i_{m_1}}) / [m_1 + \sum\sum_{g \ne h = 1}^{m_1} \rho_{i_g i_h}]^{1/2}$.

Similar to the case of three outliers, it is equivalent to Murphy's test for $m_1$ outliers for a random sample from a $N(\mu, \sigma^2)$ distribution. The statistic U for two outliers can thus be extended to $m_1$ outliers, but the statistic V cannot be extended in this manner.


6.3. Distribution theory

We shall discuss the distribution theory of these u's for the case of three outliers in detail. Results for the general case are analogous.

From equation (6.2.2) we have

    $u_{hij} = C_{hij}(w_h + w_i + w_j)$,  $1 \le h < i < j \le n$,

where

(6.3.1)    $C_{hij} = 1/[3 + 2(\rho_{hi} + \rho_{hj} + \rho_{ij})]^{1/2}$.

Now applying Corollary 2.2.1 with

    $M = C_{hij}(0, \ldots, 0, 1, 0, \ldots, 0, 1, 0, \ldots, 0, 1, 0, \ldots, 0)$,

where 1 occurs in the hth, ith and jth places, and noting that $C = MRM' = 1$, we get the marginal pdf of $u_{hij}$ as

(6.3.2)    $f(u_{hij}) = (1 - u_{hij}^2)^{(p-3)/2} / B[1/2, (p-1)/2]$,  $-1 < u_{hij} < 1$,

where $p = n - k + \nu$.

It is clear that for general $m_1$ the marginal distribution of $u_{i_1 i_2 \ldots i_{m_1}}$ is identical to that of $u_{hij}$ given at equation (6.3.2). Other joint distributions can be obtained from Theorem 2.2.1 in a like manner.

6.4. Percentile points

A nominal upper percentile point $u_\alpha^{(3)}$ for $U^{(3)}$ can be obtained by using the first Bonferroni inequality. We have

    $U^{(3)} = \max_{1 \le h < i < j \le n} u_{hij}$.

Hence

    $\Pr(U^{(3)} > u_\alpha^{(3)}) \le \sum_{1 \le h < i < j \le n} \Pr(u_{hij} > u_\alpha^{(3)}) = \binom{n}{3} \Pr(u_{hij} > u_\alpha^{(3)})$,

since the marginal distribution of each $u_{hij}$ is the same. Consequently a nominal upper $100\alpha$ percentage point $u_\alpha^{(3)}$ is given by

(6.4.1)    $\Pr(u_{hij} > u_\alpha^{(3)}) = 6\alpha/[n(n-1)(n-2)]$.

Or, equivalently, from Lemma 2.3.1, for $u_\alpha^{(3)} > 0$ it is given by

(6.4.2)    $I_{1-(u_\alpha^{(3)})^2}[(p-1)/2, 1/2] = 12\alpha/[n(n-1)(n-2)]$.

Solving equation (6.4.2) for $u_\alpha^{(3)}$, a nominal upper percentile point for $U^{(3)}$ is obtained.

Similarly, for $m_1$ outliers a nominal upper $100\alpha$ percentage point $u_\alpha^{(m_1)}$ of $U^{(m_1)}$ is given by

(6.4.3)    $I_{1-(u_\alpha^{(m_1)})^2}[(p-1)/2, 1/2] = 2\alpha/\binom{n}{m_1}$.

These nominal upper percentile points $u_\alpha^{(3)}$ for $\alpha$ = 0.01, 0.05 are given in Table 6.4.1. Similar to Table 2.3.1, these are tabulated for $\nu$ = 0, n = 5(1)12, 14(1)16(2)20, 21, 24, 25, 27, 28(2)32, 33, 35, 36, 40, 42, 45, 48(1)50(5)60(10)100 and k = 1(1) min(15, n-2).

Using the relation given at equation (6.2.3), nominal percentiles for the $T_{N3}$ statistic are obtained for n = 10, 20, 50 and 100. These are given in Table 6.4.2. These values can be compared with the simulated critical points of $T_{N3}$ obtained by Barnett and Lewis (1978). From this we find that our values agree closely with theirs for $n \le 50$.
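Equation (6.4.1) can be solved numerically without tables of the incomplete beta function. The following is a minimal sketch, not the method used for Table 6.4.1: it integrates the null density (6.3.2) after the substitution $u = \cos t$ (Simpson's rule) and finds the percentage point by bisection. The function names, step count and iteration count are illustrative choices.

```python
import math


def tail_prob(c, p, steps=2000):
    """Pr(u > c) under the null density (1-u^2)^((p-3)/2) / B(1/2, (p-1)/2)."""
    upper = math.acos(c)
    h = upper / steps

    def f(t):
        # integrand after substituting u = cos t
        return math.sin(t) ** (p - 2)

    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    integral = total * h / 3
    log_beta = math.lgamma(0.5) + math.lgamma((p - 1) / 2) - math.lgamma(p / 2)
    return integral / math.exp(log_beta)


def nominal_point(n, k, alpha, m=3, nu=0):
    """Nominal upper point: Pr(u > c) = alpha / C(n, m), with p = n - k + nu."""
    p = n - k + nu
    target = alpha / math.comb(n, m)
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if tail_prob(mid, p) > target:
            lo = mid   # tail still too large: move the point up
        else:
            hi = mid
    return (lo + hi) / 2


print(round(nominal_point(5, 1, 0.05), 4), round(nominal_point(6, 1, 0.05), 4))
```

For n = 5 and n = 6 with k = 1 the tail probability has a closed form, and the results can be compared with the first column of Table 6.4.1.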
6.5. Performance of the statistic

We now study the performance of the test statistic proposed in Section 6.2 in the non-null situation when three outliers are present on the right. The distribution theory results for this case are direct generalizations of the results for the two-outlier case. Further, for simplicity we shall consider the case $\nu = 0$ in detail. The null hypothesis $H_0$ specifies that there are no outliers present, while the alternative hypothesis is the union of the hypotheses $H_{hij}$ ($1 \le h < i < j \le n$).

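6.5.1. Distribution Theory

Without loss of generality, we consider (h, i, j) = (1, 2, 3). Then under $H_{123}$,

    $y \sim N(E(y|H_{123}), \sigma^2 I)$,  $E(y|H_{123}) = X\beta + e_1\theta_1 + e_2\theta_2 + e_3\theta_3$,

where $e_s$ (s = 1, 2, ..., n) is the sth column of $I_n$. Further $\theta_i > 0$ for i = 1, 2, 3.

Under $H_{123}$ the residual vector $e = Ay$ is normally distributed with variance-covariance matrix $\sigma^2 A$ and mean

    $E(e|H_{123}) = A\,E(y|H_{123})$
               $= A[E(y|H_0) + e_1\theta_1 + e_2\theta_2 + e_3\theta_3]$
               $= A(e_1\theta_1 + e_2\theta_2 + e_3\theta_3)$
               $= \Lambda_1\theta_1 + \Lambda_2\theta_2 + \Lambda_3\theta_3$,

since $AX = 0$, where $\Lambda_s$ denotes the sth column of A. The error sum of squares $S^2$ has a non-central $\sigma^2\chi^2$ distribution with n-k degrees of freedom and non-centrality parameter $\lambda^{*}$, where

    $\sigma^2\lambda^{*} = E(y'|H_{123})\,A\,E(y|H_{123})$
               $= (\beta'X' + e_1'\theta_1 + e_2'\theta_2 + e_3'\theta_3)\,A\,(X\beta + e_1\theta_1 + e_2\theta_2 + e_3\theta_3)$
               $= \sum_{i=1}^{3}\sum_{j=1}^{3} \lambda_{ij}\theta_i\theta_j$

(6.5.1)        $= \lambda_{11}\theta_1^2 + \lambda_{22}\theta_2^2 + \lambda_{33}\theta_3^2 + 2\lambda_{12}\theta_1\theta_2 + 2\lambda_{13}\theta_1\theta_3 + 2\lambda_{23}\theta_2\theta_3$.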


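From equation (6.2.1), we have

    $t_{hij} = C_{hij}(e_h/\lambda_{hh}^{1/2} + e_i/\lambda_{ii}^{1/2} + e_j/\lambda_{jj}^{1/2}) = c'y$ (say),

where $e_h$, $e_i$ and $e_j$ here denote elements of the residual vector $e = Ay$,

    $c = C_{hij}(\Lambda_h/\lambda_{hh}^{1/2} + \Lambda_i/\lambda_{ii}^{1/2} + \Lambda_j/\lambda_{jj}^{1/2})$,

$\Lambda_i$ is the ith column of A, and $C_{hij}$ is given at equation (6.3.1). Note that $c'c = 1$ and the variance of $t_{hij}$ is $\sigma^2$. Further, since A is symmetric and idempotent, $A\Lambda_s = \Lambda_s$ for every s, so that

    $Ac = c$.

Since $t_{hij}$ is a linear combination of y, under $H_{123}$

    $t_{hij} \sim N(\mu, \sigma^2)$,

where

    $\mu = E(t_{hij}|H_{123}) = c'E(y|H_{123})$
      $= C_{hij}[(\lambda_{1h}/\lambda_{hh}^{1/2} + \lambda_{1i}/\lambda_{ii}^{1/2} + \lambda_{1j}/\lambda_{jj}^{1/2})\theta_1$
        $+ (\lambda_{2h}/\lambda_{hh}^{1/2} + \lambda_{2i}/\lambda_{ii}^{1/2} + \lambda_{2j}/\lambda_{jj}^{1/2})\theta_2$
        $+ (\lambda_{3h}/\lambda_{hh}^{1/2} + \lambda_{3i}/\lambda_{ii}^{1/2} + \lambda_{3j}/\lambda_{jj}^{1/2})\theta_3]$

(6.5.2)  $= C_{hij}[(\rho_{1h} + \rho_{1i} + \rho_{1j})\lambda_{11}^{1/2}\theta_1 + (\rho_{2h} + \rho_{2i} + \rho_{2j})\lambda_{22}^{1/2}\theta_2 + (\rho_{3h} + \rho_{3i} + \rho_{3j})\lambda_{33}^{1/2}\theta_3]$.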


Now, if we let $Q_1 = t_{hij}^2$, then

    $Q_1 \sim \sigma^2\chi^2(1, \delta)$,

where the non-centrality parameter $\delta$ is given by

(6.5.3)    $\sigma^2\delta = E(y'|H_{123})\,c\,c'\,E(y|H_{123}) = \mu^2$,
and $\mu$ is given at equation (6.5.2).

Next,

    $Q_2 = S^2 - Q_1 = y'Ay - y'cc'y = y'[A - cc']y$.

Now

    $(A - cc')(A - cc') = A - Acc' - cc'A + cc'cc' = A - cc' - cc' + cc' = A - cc'$.

Thus $A - cc'$ is an idempotent matrix. Further

    $\mathrm{rank}(A - cc') = \mathrm{tr}(A - cc') = \mathrm{tr}(A) - \mathrm{tr}(cc') = n - k - 1$.

Consequently, $Q_2 \sim \sigma^2\chi^2(n-k-1, \eta)$, where the non-centrality parameter $\eta$ is given by

(6.5.4)    $\sigma^2\eta = E(y'|H_{123})(A - cc')E(y|H_{123}) = \sigma^2\lambda^{*} - \sigma^2\delta$,

where $\sigma^2\lambda^{*}$ and $\sigma^2\delta$ are given at equations (6.5.1) and (6.5.3) respectively.

Finally, $Q_1$ and $Q_2$ are independent since

    $(A - cc')c = Ac - cc'c = c - c = 0$.
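The matrix identities used above ($c'c = 1$, $Ac = c$, idempotency of $A - cc'$ and its trace) can be verified numerically in a small case. The sketch below assumes a random sample, so that X is a column of ones, k = 1 and $A = I - J/n$; the function name `check_projection` and the choice n = 6 are illustrative.

```python
import math


def check_projection(n=6, triple=(0, 1, 2)):
    # Random sample: X is a column of ones (k = 1), so A = I - J/n.
    A = [[(1.0 if r == s else 0.0) - 1.0 / n for s in range(n)] for r in range(n)]

    def rho(g, h):
        return A[g][h] / math.sqrt(A[g][g] * A[h][h])

    h, i, j = triple
    C = 1.0 / math.sqrt(3 + 2 * (rho(h, i) + rho(h, j) + rho(i, j)))   # (6.3.1)
    # c = C * (sum of the chosen columns of A, each scaled by lambda_ss^(-1/2))
    c = [C * sum(A[r][s] / math.sqrt(A[s][s]) for s in triple) for r in range(n)]
    M = [[A[r][s] - c[r] * c[s] for s in range(n)] for r in range(n)]  # A - cc'
    MM = [[sum(M[r][t] * M[t][s] for t in range(n)) for s in range(n)]
          for r in range(n)]
    Ac = [sum(A[r][s] * c[s] for s in range(n)) for r in range(n)]
    return {
        "c'c": sum(v * v for v in c),
        "max|Ac - c|": max(abs(Ac[r] - c[r]) for r in range(n)),
        "max|M^2 - M|": max(abs(MM[r][s] - M[r][s])
                            for r in range(n) for s in range(n)),
        "trace": sum(M[r][r] for r in range(n)),
    }


res = check_projection()
print(res)
```

With n = 6 and k = 1 the trace should equal n - k - 1 = 4, the degrees of freedom of $Q_2$.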

We thus have a result analogous to Lemma 5.2.1. The distribution of

    $u_{hij} = t_{hij}/(Q_1 + Q_2)^{1/2}$

is now obtained from Theorem 5.2.1, with $\mu$, $\delta$ and $\eta$ given by equations (6.5.2), (6.5.3) and (6.5.4) respectively. This is given (with $\mu > 0$, etc.) by

    $f(u) = \sum_{j=0}^{\infty} K_j^{(\eta)} \sum_{i=0}^{\infty} K_i^{(\delta)}\, u^i (1-u^2)^{j+a/2-1} / B[(i+1)/2,\, j+a/2]$,  $-1 \le u \le 1$,

where

    $K_j^{(\eta)} = (\eta/2)^j e^{-\eta/2}/j!$,  j = 0, 1, 2, ...,

    $K_i^{(\delta)} = (\delta/2)^{i/2} e^{-\delta/2}/\Gamma(i/2+1)$,  i = 0, 1, 2, ...,

$a = n-k-1$ and $\delta = \mu^2/\sigma^2$.

Using this, $\Pr(u_{hij} > u_\alpha^{(3)} | H_{123})$ etc. can now be evaluated exactly as well as approximately by applying the techniques discussed in Sections 5.2 and 5.3.
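The double series for $f(u)$ can be evaluated by truncation. The sketch below is an illustration under stated assumptions, not the exact or approximate method of Sections 5.2 and 5.3: the coefficients are taken as $K_j^{(\eta)} = e^{-\eta/2}(\eta/2)^j/j!$ and $K_i^{(\delta)} = e^{-\delta/2}(\delta/2)^{i/2}/\Gamma(i/2+1)$, the series is cut after a fixed number of terms, and the name `make_density` and the parameter values are hypothetical. It checks that the central case reduces to (6.3.2) and that the density integrates to about 1.

```python
import math


def make_density(delta, eta, a, terms=30):
    """Truncated double series for the density of u (assumes mu > 0, a >= 2)."""
    def log_beta(x, y):
        return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

    k_eta = [math.exp(-eta / 2) * (eta / 2) ** j / math.factorial(j)
             for j in range(terms)]
    k_delta = [math.exp(-delta / 2) * (delta / 2) ** (i / 2) / math.gamma(i / 2 + 1)
               for i in range(terms)]
    coef = [[k_eta[j] * k_delta[i] / math.exp(log_beta((i + 1) / 2, j + a / 2))
             for i in range(terms)] for j in range(terms)]

    def f(u):
        base = 1.0 - u * u
        return sum(coef[j][i] * u ** i * base ** (j + a / 2 - 1)
                   for j in range(terms) for i in range(terms))

    return f


# Central case (delta = eta = 0) with a = 3: density at 0 is 1 / B(1/2, 3/2) = 2/pi
central = make_density(0.0, 0.0, 3)
print(central(0.0))

# Non-central case: midpoint rule over (-1, 1) should give a total close to 1
f = make_density(2.0, 1.0, 3, terms=25)
N = 400
h = 2.0 / N
total = sum(f(-1.0 + (m + 0.5) * h) for m in range(N)) * h
print(total)
```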


6.5.2. Measures of performance

Similar to the case of two outliers, we consider the following measures of performance:

    $P_{hij} = \Pr(u_{hij} > u_\alpha^{(3)} | H_{hij})$,

    $Q_{hij} = \Pr(U^{(3)} > u_\alpha^{(3)} | H_{hij})$,

    $P_a = \min_{1 \le h < i < j \le n} P_{hij}$,

and

    $Q_a = \min_{1 \le h < i < j \le n} Q_{hij}$.

If $P_{hij}$ does not depend on h, i and j, then $P_a = P_{123}$ is the probability that $u_{123}$ is significantly large when $H_{123}$ is true. Similarly, if $Q_{hij}$ does not depend on h, i and j, then $Q_a = Q_{123}$ is the power function.
For the remainder of this section, we shall confine our attention to the case of a random sample from a normal population. As shown in equation (6.2.3), the test statistic $U^{(3)}$ is now equivalent to Murphy's test statistic for 3 outliers. Further, $P_{hij}$ does not depend on h, i and j. Consequently, $P_a = P_{123}$. Further $\lambda_{ii} = (n-1)/n = \lambda$ and $\rho_{ij} = -1/(n-1) = \rho$ for $i \ne j$. The parameters involved in the evaluation of $P_{123}$ are obtained from equations (6.5.3) and (6.5.4), with h = 1, i = 2 and j = 3 in these equations. These are given by

    $\delta = \lambda[(1+2\rho)/3](\theta_1 + \theta_2 + \theta_3)^2$
      $= [(n-3)/3n](\theta_1 + \theta_2 + \theta_3)^2$

and

    $\eta = \lambda[2(1-\rho)/3](\theta_1^2 + \theta_2^2 + \theta_3^2 - \theta_1\theta_2 - \theta_1\theta_3 - \theta_2\theta_3)$
      $= (2/3)(\theta_1^2 + \theta_2^2 + \theta_3^2 - \theta_1\theta_2 - \theta_1\theta_3 - \theta_2\theta_3)$.
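The closed forms for $\delta$ and $\eta$ can be checked against the general expressions (6.5.1) to (6.5.4). The sketch below assumes $\sigma = 1$ (the displayed formulas carry no $\sigma$) and a random sample with k = 1; the function name is illustrative.

```python
def noncentrality_random_sample(n, thetas):
    """delta and eta from (6.5.1)-(6.5.4) for a random sample, sigma = 1 assumed."""
    lam = (n - 1) / n           # lambda_ii
    rho = -1.0 / (n - 1)        # rho_ij, i != j
    s1 = sum(thetas)
    s2 = sum(t * t for t in thetas)
    cross = (s1 * s1 - s2) / 2  # theta1*theta2 + theta1*theta3 + theta2*theta3
    lam_star = lam * s2 + 2 * lam * rho * cross    # (6.5.1), lambda_ij = rho * lam
    C = 1.0 / (3 + 6 * rho) ** 0.5                 # (6.3.1)
    mu = C * (1 + 2 * rho) * lam ** 0.5 * s1       # (6.5.2) with (h, i, j) = (1, 2, 3)
    delta = mu * mu                                # (6.5.3)
    eta = lam_star - delta                         # (6.5.4)
    return delta, eta


n, thetas = 20, (1.0, 2.0, 4.0)
delta, eta = noncentrality_random_sample(n, thetas)
print(delta, (n - 3) / (3 * n) * sum(thetas) ** 2)
print(eta, (2 / 3) * (1 + 4 + 16 - 2 - 4 - 8))
```

When $\theta_1 = \theta_2 = \theta_3$ the second parameter vanishes, so only $\delta$ drives the power in the equal-shift case tabulated below.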

The evaluation of $P_{123}$ for Murphy's test, although straightforward, is rather time consuming for all combinations of $\theta_1$, $\theta_2$ and $\theta_3$. We therefore consider the case $\theta_1 = \theta_2 = \theta_3 = \theta$ and tabulate $P_{123}$ for $\alpha = 0.05$, n = 20 and 50 in Tables 6.5.1 and 6.5.2 respectively. The values of $P_{123}$ are calculated by the exact formula for $\theta \le 3$ and by the approximate formula for $\theta \ge 4$. As can be seen from these tables, the performance is quite good if 3 outliers are really present. For the sake of comparison, we also tabulate $P_{123}$ when (i) $\theta_1 = \theta_2 = \theta$, $\theta_3 = 0$ and (ii) $\theta_1 = \theta$, $\theta_2 = \theta_3 = 0$ in the last two columns. The first case corresponds to the situation when only two outliers on the right are present and Murphy's test for 3 outliers is applied. In this case the performance is not very good, especially for n = 20. However, for n = 50 and $\theta \ge 6$, the test performs reasonably well.

The second case, with $\theta_1 = \theta$ and $\theta_2 = \theta_3 = 0$, corresponds to the situation when there is only one outlier and a test for 3 outliers is applied. For this case the performance is extremely poor even for n = 50.


Next consider the measure $Q_a = Q_{123}$. Clearly $P_{123} \le Q_{123}$, that is, $P_a \le Q_a$. This gives a lower bound for $Q_a$. An upper bound for $Q_a = \Pr(U^{(3)} > u_\alpha^{(3)} | H_{123})$ is obtained by applying the first Bonferroni inequality. Thus

    $Q_a \le \min(1, \sum_{1 \le h < i < j \le n} \Pr(u_{hij} > u_\alpha^{(3)} | H_{123}))$.

Since the number of terms to be added is now considerably larger than for the two-outlier case, this bound is not very useful except for very small values of n. We therefore evaluate $Q_a$ by simulation using 1000 iterations for n = 20 and 50, $\alpha = 0.05$ and for the case $\theta_1 = \theta_2 = \theta_3 = \theta$.

In Tables 6.5.1 and 6.5.2, we tabulate simulated values of $P_a$ and $Q_a$ in the third and fourth columns respectively. Note that the simulated values of $P_a$ agree closely with the theoretical values. The values of $Q_a$ are close to $P_a$ for larger values of $\theta$. However, they are not so close for small values of $\theta$, as $P_a$ only provides a lower bound for $Q_a$.
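A simulation of this kind can be sketched as follows. This is an illustrative reconstruction, not the thesis program: it assumes unit variance, adds the same shift $\theta$ to the first three observations, computes $U^{(3)}$ in its Murphy form from (6.2.3), and uses the nominal point 0.7642 read from Table 6.4.1 (n = 20, k = 1, $\alpha$ = 0.05). The seed and iteration count are arbitrary.

```python
import math
import random


def murphy_u3(y):
    """U^(3) for a random sample: K' * (sum of three largest - 3 * mean) / S."""
    n = len(y)
    ybar = sum(y) / n
    S = math.sqrt(sum((v - ybar) ** 2 for v in y))   # root of the sum of squares
    K_prime = math.sqrt(n / (3 * (n - 3)))           # equals K * sqrt(n - 1)
    return K_prime * (sum(sorted(y)[-3:]) - 3 * ybar) / S


def simulate_power(n, theta, crit, iterations=1000, seed=0):
    """Monte Carlo estimate of Pr(U^(3) > crit) with three shifted observations."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(iterations):
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for s in range(3):
            y[s] += theta    # outliers on the right in positions 1, 2, 3
        if murphy_u3(y) > crit:
            hits += 1
    return hits / iterations


p0 = simulate_power(20, 0.0, 0.7642, iterations=300)
p6 = simulate_power(20, 6.0, 0.7642, iterations=300)
print(p0, p6)
```

Counting only the replications in which the maximising triple is exactly (1, 2, 3) would give a corresponding estimate of $P_a$ instead.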





TABLE 6.4.1. Nominal upper critical values $u_\alpha^{(3)}$ of the one-sided test statistic for three outliers in linear regression

($\alpha$ = 0.05)

  n \ k      1       2       3       4       5       6       7
    5      .9587   .9900   .9999
    6      .9417   .9740   .9950  1.0000
    7      .9248   .9560   .9821   .9971  1.0000
    8      .9035   .9379   .9653   .9870   .9982  1.0000
    9      .8929   .9203   .9473   .9717   .9900   .9988  1.0000
   10      .8780   .9036   .9294   .9544   .9763   .9922   .9992
   11      .8639   .8877   .912?   .9366   .9599   .9798   .9937
   12      .8505   .8727   .8958   .9193   .9425   .9643   .9825
   14      .8257   .8452   .8656   .8867   .9085   .9303   .9515
   15      .8142   .8325   .8517   .8717   .8924   .9136   .9347
   16      .8033   .8205   .8386   .8575   .8772   .8974   .9181
   18      .7828   .7983   .8144   .8313   .8490   .8674   .8864
   20      .7642   .7781   .7926   .8078   .8237   .8404   .8577
   21      .7554   .7686   .7825   .7969   .8121   .8279   .8443
   24      .7311   .7427   .7547   .7672   .7803   .7940   .8082
   25      .7236   .7347   .7462   .7582   .7707   .7837   .7973
   27      .7095   .7196   .7302   .7412   .7527   .7646   .7770
   28      .7027   .7125   .7227   .7332   .7442   .7557   .7676
   30      .6899   .6990   .7084   .7132   .7283   .7389   .7498
   32      .6779   .6864   .6951   .7042   .7136   .7233   .7335
   33      .6722   .6804   .6888   .6976   .7066   .7160   .7258
   35      .6613   .6689   .6768   .6850   .6934   .702?   .7112
   36      .6560   .6634   .6711   .6790   .6871   .6956   .7043
   40      .6365   .6430   .6498   .6567   .6639   .6713   .6790
   42      .6274   .6336   .6400   .6466   .6533   .6603   .6675
   45      .6147   .6204   .6263   .6323   .6386   .6449   .6515
   48      .6028   .6081   .6136   .6192   .6249   .6308   .6368
   49      .5991   .6042   .6095   .6150   .6206   .6263   .6322
   50      .5954   .6004   .6056   .6109   .6164   .6220   .6277
   55      .5780   .5825   .5872   .5919   .5967   .6017   .6067
   60      .5624   .5664   .5706   .5748   .5791   .5836   .5881
   70      .5350   .5384   .5418   .5453   .5489   .5525   .5562
   80      .5119   .5147   .5176   .5206   .5235   .5266   .5297
   90      .4919   .4943   .4968   .4993   .5019   .5045   .5072
  100      .4743   .4765   .4787   .4809   .4831   .4854   .4877
TABLE 6.4.1. Contd.

($\alpha$ = 0.05)

n \ k        8        9       10       11       12       13       14       15

   10   1.0000
   11    .9994   1.0000
   12    .9948    .9995   1.0000
   14    .9708    .9864    .9963    .9997   1.0000
   15    .9550    .9733    .9879    .9968    .9998   1.0000
   16    .9335    .9581    .9755    .9891    .9972    .9998   1.0000
   18    .9059    .9256    .9449    .9631    .9789    .9909    .9978    .9999
   20    .8756    .8941    .9129    .9317    .9500    .9670    .9816    .9923
   21    .8615    .8792    .8974    .9159    .9343    .9522    .9637    .9827
   24    .8231    .8386    .8547    .8713    .8885    .9059    .9236    .9410
   25    .8115    .8263    .8417    .8577    .8742    .8911    .9084    .9258
   27    .7900    .8035    .8176    .8322    .8474    .8631    .8793    .8960
   28    .7800    .7929    .8064    .8204    .8349    .8500    .8656    .8817
   30    .7612    .7731    .7854    .7982    .8116    .8255    .8399    .8543
   32    .7440    .7549    .7663    .7781    .7903    .8031    .8164    .8301
   33    .7359    .7464    .7573    .7686    .7804    .7927    .8054    .8186
   35    .7206    .7303    .7404    .7509    .7618    .7731    .7848    .7970
   36    .7134    .7227    .7325    .7426    .7530    .7639    .7752    .7869
   40    .6869    .6951    .7035    .7122    .7213    .7306    .7403    .7504
   42    .6749    .6826    .6905    .6986    .7071    .7158    .7248    .7342
   45    .6583    .6653    .6725    .6799    .6875    .6954    .7036    .7120
   48    .6431    .6495    .6561    .6628    .6698    .6770    .6845    .6921
   49    .6383    .6445    .6509    .6575    .6643    .6713    .6785    .6859
   50    .6336    .6397    .6459    .6523    .6589    .6657    .6727    .6799
   55    .6119    .6173    .6227    .6284    .6341    .6400    .6461    .6523
   60    .5927    .5975    .6023    .6073    .6124    .6176    .6230    .6284
   70    .5600    .5638    .5677    .5717    .5758    .5800    .5843    .5887
   80    .5328    .5361    .5393    .5426    .5460    .5495    .5530    .5566
   90    .5099    .5126    .5154    .5182    .5211    .5240    .5269    .5300
  100    .4900    .4924    .4948    .4972    .4997    .5022    .5048    .5073
TABLE 6.4.1. Contd.

($\alpha$ = 0.01)

n \ k        1        2        3        4        5        6        7

    5    .9859    .9980   1.0000
    6    .9741    .9911    .9990   1.0000
    7    .9608    .9804    .9939    .9994   1.0000
    8    .9470    .9676    .9845    .9955    .9996   1.0000
    9    .9332    .9538    .9725    .9874    .9966    .9998   1.0000
   10    .9195    .9397    .9590    .9761    .9894    .9973    .9998
   11    .906?    .9258    .9451    .9632    .9790    .9910    .9978
   12    .8933    .9122    .9311    .9495    .9666    .9813    .9922
   14    .8639    .8862    .9039    .9217    .9394    .9563    .9718
   15    .8574    .8739    .8909    .9082    .9256    .9427    .9590
   16    .8464    .8621    .8784    .8951    .9120    .9290    .9457
   18    .8256    .8400    .8549    .8703    .8861    .9023    .9186
   20    .8064    .8196    .8333    .8474    .8621    .8771    .8926
   21    .7973    .8100    .8231    .8367    .8507    .8652    .8802
   24    .7721    .7833    .7949    .8069    .8194    .8323    .8456
   25    .7643    .7751    .7862    .7978    .8097    .8221    .8350
   27    .7494    .7594    .7698    .7805    .7916    .8030    .8149
   28    .7424    .7520    .7620    .7723    .7830    .7940    .8055
   30    .7289    .7379    .7472    .7568    .7667    .7770    .7876
   32    .7162    .7246    .7333    .7423    .7516    .7611    .7710
   33    .7102    .7183    .7267    .7354    .7444    .7536    .7632
   35    .6986    .7063    .7141    .7223    .7307    .7393    .7483
   36    .6931    .7005    .7031    .7160    .7241    .7325    .7412
   40    .6723    .6789    .6857    .6927    .6999    .7073    .7150
   42    .6627    .6690    .6754    .6820    .6888    .6958    .7030
   45    .6492    .6550    .6609    .6670    .6733    .6797    .6864
   48    .6365    .6419    .6474    .6531    .6589    .6649    .6710
   49    .6325    .6373    .6432    .6487    .6544    .6602    .6661
   50    .6286    .6337    .6390    .6444    .6499    .6556    .6614
   55    .6101    .6147    .6194    .6242    .6291    .6342    .6393
   60    .5934    .5975    .6018    .6061    .6105    .6151    .6197
   70    .5642    .5676    .5711    .5747    .5784    .5821    .5859
   80    .5394    .5423    .5453    .5484    .5515    .5546    .5578
   90    .5180    .5206    .5231    .5258    .5284    .5311    .5338
  100    .4993    .5015    .5038    .5061    .5084    .5107    .5131
TABLE 6.4.1. Contd.

($\alpha$ = 0.01)

n \ k        8        9       10       11       12       13       14       15

   10   1.0000
   11    .9998   1.0000
   12    .9982    .9999   1.0000
   14    .9847    .9939    .9987    .9999   1.0000
   15    .9738    .9?60    .9?46    .9989   1.0000   1.0000
   16    .9614    .9756    .9871    .9951    .9990   1.0000   1.0000
   18    .9348    .9506    .9654    .9785    .9889    .9960    .9993   1.0000
   20    .9083    .9240    .9396    .9546    .9686    .9808    .9903    .9966
   21    .8954    .9109    .9264    .9417    .9564    .9700    .9817    .9909
   24    .8594    .8735    .8880    .9027    .9176    .9325    .9470    .9608
   25    .8482    .8619    .8759    .8903    .9049    .9196    .9342    .9485
   27    .8272    .8399    .8530    .8665    .8803    .8944    .9037    .9231
   28    .8173    .8295    .8422    .8552    .8686    .8823    .8963    .9105
   30    .7986    .8100    .8217    .8339    .8464    .8593    .8725    .8860
   32    .7813    .7918    .8028    .8141    .8258    .8378    .8502    .8630
   33    .7731    .7833    .7938    .8047    .8160    .8276    .8396    .8520
   35    .7575    .7670    .7769    .7870    .7976    .8084    .8196    .8312
   36    .7501    .7593    .7688    .7787    .7888    .7993    .8101    .8213
   40    .7229    .7310    .7393    .7479    .7568    .7660    .7755    .7852
   42    .7105    .7181    .7259    .7340    .7424    .7510    .7599    .7690
   45    .6932    .7002    .7074    .7148    .7224    .7302    .7333    .7467
   48    .6773    .6837    .6903    .6972    .7041    .7113    .7137    .7264
   49    .6723    .6785    .6850    .6916    .6984    .7054    .7126    .7200
   50    .6674    .6735    .6798    .6862    .6928    .6996    .7066    .7138
   55    .6446    .6500    .6556    .6613    .6671    .6731    .6792    .6855
   60    .6244    .6292    .6342    .6392    .6444    .6497    .6551    .6606
   70    .5898    .5937    .5977    .6018    .6060    .6103    .6146    .6191
   80    .5610    .5643    .5677    .5711    .5746    .5781    .5817    .5854
   90    .5366    .5394    .5423    .545?    .5482    .5512    .5542    .5573
  100    .5155    .5180    .5205    .5230    .5255    .5281    .5307    .5334
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Table 6 . i , 2 • Com-oarlson of nominal and simulated critical 
values for T^.j^. 


            $\alpha$ = 0.05            $\alpha$ = 0.01
   n     Nominal  Tabulated     Nominal  Tabulated

  10      3.329      3.32        4.099      4.00
  20      5.344      5.30        5.639      5.60
  50      6.996      6.32        7.386      7.34
 100      8.048      7.77        8.473      8.27
TABLE 6.5.1. The probabilities $P_\alpha$ and $Q_\alpha$ for Murphy's
test for n = 20 and $\alpha$ = 0.05.

Columns 2 to 4: $\theta_1 = \theta_2 = \theta_3 = \theta$.
Column 5: $\theta_1 = \theta_2 = \theta$, $\theta_3 = 0$.
Column 6: $\theta_1 = \theta$, $\theta_2 = \theta_3 = 0$.

 $\theta$    $P_\alpha$    $P_\alpha$ simulated    $Q_\alpha$ simulated    $P_\alpha$    $P_\alpha$

   0     0.000        0.000              0.048           0.000    0.0000
   1     0.004        0.004              0.051           0.001    0.0002
   2     0.087        0.090              0.181           0.008    0.0006
   3     0.452        0.460              0.538           0.030    0.0010
   4     0.866        0.879              0.903           0.074    0.0016
   5     0.991        0.991              0.991           0.132    0.0015
   6     1.000        1.000              1.000           0.199    0.0012
   7     1.000        1.000              1.000           0.266    0.0008
   8     1.000        1.000              1.000           0.331    0.0005

TABLE 6.5.2. The probabilities $P_\alpha$ and $Q_\alpha$ for Murphy's
test for n = 50 and $\alpha$ = 0.05.

Columns 2 to 4: $\theta_1 = \theta_2 = \theta_3 = \theta$.
Column 5: $\theta_1 = \theta_2 = \theta$, $\theta_3 = 0$.
Column 6: $\theta_1 = \theta$, $\theta_2 = \theta_3 = 0$.

 $\theta$    $P_\alpha$    $P_\alpha$ simulated    $Q_\alpha$ simulated    $P_\alpha$    $P_\alpha$

   0     0.000        0.000              0.032           0.000    0.0000
   1     0.001        0.000              0.041           0.000    0.0000
   2     0.060        0.056              0.193           0.004    0.0001
   3     0.476        0.503              0.618           0.035    0.0005
   4     0.928        0.924              0.953           0.152    0.0022
   5     0.999        0.997              0.998           0.381    0.0044
   6     1.000        1.000              1.000           0.650    0.0077
   7     1.000        1.000              1.000           0.850    0.0120
   8     1.000        1.000              1.000           0.951    0.0170
SOLUTION OF $I_x(p, 1/2) = \gamma$

Here we give an algorithm for numerical evaluation of the
incomplete beta function $I_h(p, 1/2)$, where p is a multiple of
half. Then using this we develop an algorithm for the inverse of
the incomplete beta function by a numerical iteration procedure for
small values of $\gamma$, $0 < \gamma < 1$. The calculations are performed
on the DEC 1090 computer system.

a. Numerical evaluation of incomplete beta function

The incomplete beta function $I_h(p, q)$ is calculated using
the recursive formula (Bremner, 1978)

(A.1)  $I_h(p, q) = I_h(p-1, q) - \dfrac{h^{p-1}(1-h)^{q}}{(p+q-1)\,B(p, q)}$,

where p > 1, q > 0 and 0 < h < 1. For our purposes, we need
$I_h(p, q)$ for q = 1/2 and p in multiples of half.

The recursion formula is started with

$I_h(1/2, 1/2) = \int_0^h t^{-1/2}(1-t)^{-1/2}\,dt \,/\, B(1/2, 1/2) = (2/\pi)\sin^{-1}(h^{1/2})$, and

$I_h(1, 1/2) = \int_0^h (1-t)^{-1/2}\,dt \,/\, B(1, 1/2) = 1 - (1-h)^{1/2}$.

Equation (A.1) along with these initial values gives
$I_h(p, 1/2)$ recursively for all values of p which are multiples
of half.
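The recursion (A.1) with its two starting values can be sketched in a few lines of Python. This is an illustrative translation, not the thesis program: the function names are hypothetical, and the complete beta function is computed from `math.lgamma` here rather than taken from a precomputed table.

```python
import math


def complete_beta_half(p):
    """B(p, 1/2), computed from log-gamma values."""
    return math.exp(math.lgamma(p) + math.lgamma(0.5) - math.lgamma(p + 0.5))


def inc_beta_half(h, p):
    """I_h(p, 1/2) for p a positive multiple of 1/2, via recursion (A.1)."""
    two_p = round(2 * p)
    if two_p < 1 or abs(2 * p - two_p) > 1e-12:
        raise ValueError("p must be a positive multiple of 1/2")
    if not 0.0 <= h <= 1.0:
        raise ValueError("h must lie in [0, 1]")
    root = math.sqrt(1.0 - h)
    if two_p % 2 == 1:
        # Half-integer p: start from I_h(1/2, 1/2) = (2/pi) asin(sqrt(h)).
        q, value = 0.5, (2.0 / math.pi) * math.asin(math.sqrt(h))
    else:
        # Integer p: start from I_h(1, 1/2) = 1 - sqrt(1 - h).
        q, value = 1.0, 1.0 - root
    while q < p:
        q += 1.0
        # (A.1) with second parameter 1/2:
        # I_h(q, 1/2) = I_h(q-1, 1/2) - h^(q-1) (1-h)^(1/2) / ((q - 1/2) B(q, 1/2))
        value -= h ** (q - 1.0) * root / ((q - 0.5) * complete_beta_half(q))
    return value


if __name__ == "__main__":
    print(inc_beta_half(0.75, 2.0))
    print(inc_beta_half(0.5, 1.5))
```

Two closed forms give a quick check: $I_h(2, 1/2) = 1 - \tfrac{3}{2}(1-h)^{1/2} + \tfrac{1}{2}(1-h)^{3/2}$, which equals 0.3125 at h = 0.75, and $I_{1/2}(3/2, 1/2) = 1/2 - 1/\pi$.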
b. Evaluation of inverse of incomplete beta function

The solution of the equation

(A.2)  $\dfrac{1}{B(p, 1/2)} \int_0^x t^{p-1}(1-t)^{-1/2}\,dt = \gamma$,

where p is a multiple of half and $\gamma > 0$ is small, is obtained
by numerical iteration. For an initial value of x, we partially
integrate (A.2) to get

(A.3)  $\dfrac{x^{p}(1-x)^{-1/2}}{p\,B(p, 1/2)} - \dfrac{1}{2p\,B(p, 1/2)} \int_0^x t^{p}(1-t)^{-3/2}\,dt = \gamma$.

For small values of $\gamma$, the solution x of equation (A.2)
is close to zero. Neglecting the second term on the L.H.S.
of equation (A.3) and approximating $(1-x)^{-1/2}$ to 1, we get

$x^{p} = \gamma\,p\,B(p, 1/2)$.

This gives an initial value

(A.4)  $x_0 = [\gamma\,p\,B(p, 1/2)]^{1/p}$.

In general $x_0$ gives an "over estimate" of the true value and
requires a larger number of iterations. A slightly better
approximation is obtained by neglecting the second term of
equation (A.3) and considering

$\dfrac{x^{p}(1-x)^{-1/2}}{p\,B(p, 1/2)} = \gamma$.

On raising it to the power 1/p, we have

$x = [\gamma\,p\,B(p, 1/2)]^{1/p}\,(1-x)^{1/2p}$.
Using equation (A.4), and substituting $x_0$ for x in the
factor $(1-x)^{1/2p}$, we get

$x \approx x_0 (1-x_0)^{1/2p}$.

Our calculations show that, in general, this value of x
is an under estimate. But

(A.5)  $x_1 = x_0 (1 - x_0/2p)$

gives the best approximate value, even for small values of p.
This is used for starting the iteration procedure.

The final solution is obtained by the Newton-Raphson method
using the relation

(A.6)  $x_i = x_{i-1} - f(x_{i-1})/f'(x_{i-1})$,

where

$f(x) = I_x(p, 1/2) - \gamma$.

The iteration is terminated when the absolute difference
between two successive iterations is not greater than the
required accuracy.
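The starting value (A.4)-(A.5) and the Newton-Raphson step (A.6) can be sketched as follows in Python. This is a minimal illustration under stated assumptions, not the thesis subprogram: the function names, the tolerance of 1e-10, and the use of `math.lgamma` for B(p, 1/2) are choices made here, and the derivative used is $f'(x) = x^{p-1}(1-x)^{-1/2}/B(p, 1/2)$, which follows from differentiating (A.2).

```python
import math


def beta_half(p):
    """B(p, 1/2) from log-gamma values."""
    return math.exp(math.lgamma(p) + math.lgamma(0.5) - math.lgamma(p + 0.5))


def inc_beta_half(x, p):
    """I_x(p, 1/2) by recursion (A.1); p must be a positive multiple of 1/2."""
    root = math.sqrt(1.0 - x)
    if round(2 * p) % 2 == 1:
        q, value = 0.5, (2.0 / math.pi) * math.asin(math.sqrt(x))
    else:
        q, value = 1.0, 1.0 - root
    while q < p:
        q += 1.0
        value -= x ** (q - 1.0) * root / ((q - 0.5) * beta_half(q))
    return value


def inverse_inc_beta_half(gamma, p, tol=1e-10, max_iter=20):
    """Solve I_x(p, 1/2) = gamma for x, intended for small gamma."""
    b = beta_half(p)
    x0 = (gamma * p * b) ** (1.0 / p)      # (A.4): first approximation
    x = x0 * (1.0 - x0 / (2.0 * p))        # (A.5): improved starting value
    for _ in range(max_iter):
        if not 0.0 < x < 1.0:
            raise ValueError("iterate left the interval (0, 1)")
        f = inc_beta_half(x, p) - gamma
        f_prime = x ** (p - 1.0) / (math.sqrt(1.0 - x) * b)
        x_new = x - f / f_prime            # (A.6): Newton-Raphson step
        if abs(x_new - x) <= tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter steps")


if __name__ == "__main__":
    x = inverse_inc_beta_half(0.0004928, 6.5)
    print(x, inc_beta_half(x, 6.5))
```

For p = 1 the equation has the closed-form solution $x = 1 - (1-\gamma)^2$, which gives a simple check; for other p, substituting the result back into $I_x(p, 1/2)$ should reproduce $\gamma$.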

The function subprogram XBETA calculates the incomplete
beta function $I_h(p, 1/2)$ for a given value of p and h. For
this subprogram, the complete beta functions are supplied from
the main calling program. Thus, this program can calculate the
incomplete beta function for any value of h in (0,1) and p, a
multiple of half with p < 100.

The function subprogram XINBTA calculates the inverse of the
incomplete beta function for a given value of p and $\gamma$. The
error is indicated by IR, which is initially assigned the value
zero. It is equal to 1 if the iteration procedure does not
converge in 20 steps, and is equal to 2 if the solution at any
stage is greater than 1. The final solution is V. It is equal
to $x_{n+1}$ if the iteration converges in n steps, otherwise it is
equal to $x_{21}$. For checking the convergence of the iteration,
the accuracy ACU is taken to be equal to $10^{-6}$.


LANGUAGE

Fortran 10

STRUCTURE

SUBROUTINE XINBTA(CB,P,PROB,V,IR)

Formal parameters

CB      Real vector     input     This vector specifies the complete
        of length 200             beta functions B(p,1/2) for
                                  p = 0.5(0.5)100.

P       Real            input     This is the first parameter 'p'
                                  of the incomplete beta function.

PROB    Real            input     This is the desired probability $\gamma$.

V       Real            output    This is the final value given by the
                                  iteration procedure. (In the ith step,
                                  this is the value equal to
                                  $x_{i+1}$, i = 1,2,...,n.) Let n be the
                                  number of iterations required for
                                  convergence to achieve the desired
                                  accuracy; then V = $x_{n+1}$, otherwise
                                  V = $x_{21}$.

IR      Integer         output    Error indicator
                                  = 1 if the iteration does not converge
                                  = 2 if $|x_i| > 1$ for some i = 1,2,...,n
                                  = 0 otherwise.

Auxiliary algorithm

Subroutine XINBTA calls subroutine XBETA.

ACCURACY

The program XINBTA gives accurate values up to 5 decimal
places. The accuracy can be increased up to 15 decimal places
by assigning the value ACU, appearing in the program, equal to
the necessary accuracy requirement. But in that case the number
of iterations will have to be suitably relaxed.
C
      SUBROUTINE XINBTA(CB,P,PROB,V,IR)
C
C     THIS SUBROUTINE CALCULATES THE INVERSE OF INCOMPLETE
C     BETA FUNCTION, WHEN THE SECOND PARAMETER OF THE BETA
C     FUNCTION IS HALF AND THE FIRST PARAMETER IS A MULTIPLE
C     OF HALF
C
      EXTERNAL XBETA,DSQRT,DABS,DEXP,DLOG,DATAN
      DOUBLE PRECISION X(50),CB(200),F,F1,XBETA,P,B1,BETA,H,
     1PROB,ACU,V,P1
C
C     N IS THE NUMBER OF ITERATIONS
C
      N=20
C
C     INITIALIZE CONSTANTS
C
      ACU=0.000001
      IR=0
      P1=P+P
      IP1=P1
      B1=CB(IP1)
C
C     CALCULATION OF INITIAL APPROXIMATION
C
      X(1)=DEXP((DLOG(PROB*P*B1))/P)
      X(1)=X(1)*(1-X(1)/P1)
      I=1
C
C     SOLVE FOR V USING NEWTON-RAPHSON METHOD, USING THE FUNCTION
C     XBETA
C
C     1/F1 DENOTES THE DERIVATIVE OF THE FUNCTION F
C
    3 H=DABS(X(I))
      IF(H.GE.1)GO TO 31
      F=XBETA(CB,P,H)-PROB
      F1=DEXP(DLOG(B1)-(P-1)*DLOG(X(I))+(DLOG(1-X(I)))/2)
      X(I+1)=X(I)-F*F1
      IF(DABS(X(I+1)-X(I))/X(I).LE.ACU)GO TO 20
      IF(I.GT.N)GO TO 30
      I=I+1
      GO TO 3
   20 V=X(I+1)
      GO TO 10
C
C     THE FOLLOWING DEFINES THE ERROR INDICATOR IR
C
   30 IR=1
      GO TO 10
   31 IR=2
   10 RETURN
      END
C
C
      DOUBLE PRECISION FUNCTION XBETA(CB,P,H)
C
C     THIS PROGRAM CALCULATES THE INCOMPLETE BETA FUNCTION
C     UPTO THE POINT H WITH PARAMETERS P AND HALF
C
      EXTERNAL DSQRT,DATAN
      DOUBLE PRECISION IHBETA(5000),CB(200),H,P,P1,C,RH,IHB,
     1B1,PI,DASIN,P2,ACU,TERM
C
C     INITIALISING CONSTANTS
C
      P1=0
      PI=3.1415926535897932
      ACU=0.1D-16
      I1=0
C
C     CALCULATION OF INITIAL VALUES FOR THE RECURRENCE RELATION
C
    8 I1=I1+1
      P1=P1+0.5
      P2=P1+P1+0.1
      IP2=P2
      B1=CB(IP2)
      RH=DSQRT(H)
      C=DSQRT(1-H)
      IF(P1.EQ.0.5)GO TO 5
      IF(P1.EQ.1.0)GO TO 6
      GO TO 20
    5 IF(RH.EQ.1)GO TO 7
      DASIN=DATAN(RH/DSQRT(1-RH*RH))
      GO TO 9
    7 DASIN=PI/2.
    9 IHBETA(I1)=2*DASIN/PI
      GO TO 10
    6 IHBETA(I1)=(1-C)
      GO TO 10
C
C     CALCULATION FOR HIGHER VALUES OF P USING RECURRENCE
C     RELATION
C
   20 IHB=IHBETA(I1-2)
      TERM=DSQRT((H**(2*P1-2))*(1-H))/((P1-0.5)*B1)
      IHBETA(I1)=IHB-TERM
      IF(TERM.LT.ACU)GO TO 17
   10 IF(P1.LT.P)GO TO 8
   17 XBETA=IHBETA(I1)
      RETURN
      END
C
C     CALLING PROGRAM
C
C
C     CALCULATION OF INVERSE OF INCOMPLETE BETA
C     FUNCTION
C
      DOUBLE PRECISION CB(200),P,PROB,V,H
C     CALCULATION OF COMPLETE BETA FUNCTIONS
C
      CB(1)=3.1415926535897932
      CB(2)=2.0
      DO 10 I=1,198
      P=DFLOAT(I)/2.
   10 CB(I+2)=(P/(P+0.5))*CB(I)
C
C     CALCULATION OF THE MAIN RESULT FOR A GIVEN
C     P AND PROB
C
      P=6.5
      PROB=0.0004928
      CALL XINBTA(CB,P,PROB,V,IR)
      IF(IR)21,20,21
   21 IF(IR-2)14,15,14
   14 PRINT 100
  100 FORMAT(//10X,'THE ITERATION DOES NOT CONVERGE')
      GO TO 500
   15 PRINT 200
  200 FORMAT(//10X,'H IS GREATER THAN 1')
      GO TO 500
   20 PRINT 300,V
  300 FORMAT(//10X,F8.5)
  500 STOP
      END
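The calling program fills the table CB by the recurrence B(p+1, 1/2) = B(p, 1/2) p/(p + 1/2), starting from B(1/2, 1/2) = pi and B(1, 1/2) = 2. The following Python sketch reproduces that table for comparison. The function name is hypothetical, and the list is zero-based, so entry k holds B((k+1)/2, 1/2) whereas the Fortran CB(I) holds B(I/2, 1/2).

```python
import math


def complete_beta_table(n_entries=200):
    """Return [B(0.5, 1/2), B(1, 1/2), ..., B(n_entries/2, 1/2)]."""
    cb = [math.pi, 2.0]                      # B(1/2, 1/2) and B(1, 1/2)
    for i in range(1, n_entries - 1):
        p = i / 2.0
        # B(p + 1, 1/2) = B(p, 1/2) * p / (p + 1/2); cb[i - 1] holds B(p, 1/2)
        cb.append(cb[i - 1] * p / (p + 0.5))
    return cb[:n_entries]


if __name__ == "__main__":
    table = complete_beta_table()
    print(len(table), table[2], table[3])
```

The third and fourth entries are B(3/2, 1/2) = pi/2 and B(2, 1/2) = 4/3, which can be compared with values obtained from the gamma function.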



